EXPRESSED SEQUENCES OF ARABIDOPSIS THALIANA

CROSS-REFERENCE TO RELATED APPLICATION

This application claims the benefit of U.S. Provisional Application 60/178,512,
filed January 27, 2000.

FIELD OF INVENTION 
The invention is in the field of polynucleotide sequences of a plant, particularly
sequences expressed in Arabidopsis thaliana.

Background of the Invention 
Plants and plant products have vast commercial importance in a wide variety
of areas including food crops for human and animal consumption, flavor enhancers
for food, and production of specialty chemicals for use in products such as
medicaments and fragrances. In considering food crops for humans and livestock,
genes such as those involved in a plant's resistance to insects, plant viruses, and
fungi; genes involved in pollination; and genes whose products enhance the
nutritional value of the food, are of major importance. A number of such genes have
been described, see, for example, McCaskill and Croteau (1999) Nature Biotechnol.
17:31-36.

Despite recent advances in methods for identification, cloning, and
characterization of genes, much remains to be learned about plant physiology in
general, including how plants produce many of the above-mentioned products;
mechanisms for resistance to herbicides, insects, plant viruses, fungi; elucidation of
genes involved in specific biosynthetic pathways; and genes involved in
environmental tolerance, e.g., salt tolerance, drought tolerance, or tolerance to
anaerobic conditions.

Arabidopsis thaliana is a model system for genetic, molecular and biochemical
studies of higher plants. Features of this plant that make it a model system for
genetic and molecular biology research include a small genome, organized into
five chromosomes and containing an estimated 20,000 genes; a rapid life cycle;
prolific seed production; and a small size that allows it to be cultivated easily in
limited space. A. thaliana is a member of the mustard family (Brassicaceae) with a broad
natural distribution throughout Europe, Asia, and North America. Many different
ecotypes have been collected from natural populations and are available for
experimental analysis. The entire life cycle, including seed germination, formation of
a rosette plant, bolting of the main stem, flowering, and maturation of the first seeds,
is completed in 6 weeks. A large number of mutant lines are available that affect
nearly all aspects of its growth. These features greatly facilitate the isolation of
fundamentally interesting and potentially important genes for agronomic development.

Most gene products from higher plants exhibit adequate sequence similarity to
deduced amino acid sequences of other plant genes to permit assignment of
probable gene function, if it is known, in any higher plant. It is likely that there will be
very few protein-encoding angiosperm genes that do not have orthologs or paralogs
in Arabidopsis. The developmental diversity of higher plants may be largely due to
changes in the cis-regulatory sequences of transcriptional regulators and not in
coding sequences.

Many advances reported over the past few years offer clear evidence that this
plant is not only a very important model species for basic research, but also
extremely valuable for applied plant scientists and plant breeders. Knowledge
gained from Arabidopsis can be used directly to develop desired traits in plants of
other species.

Relevant Literature 

Cold Spring Harbor Monograph 27 (1994) E.M. Meyerowitz and C.R.
Somerville, eds. (CSH Laboratory Press). Annual Plant Reviews, Vol. 1: Arabidopsis
(1998) M. Anderson and J.A. Roberts, eds. (CRC Press). Methods in Molecular
Biology: Arabidopsis Protocols, Vol. 82 (1997) J.M. Martinez-Zapater and J. Salinas,
eds. (CRC Press).

Mayer et al. (1999) Nature 402(6763):769-77, "Sequence and analysis of
chromosome 4 of the plant Arabidopsis thaliana". Lin et al. (1999) Nature 402(6763):761-8,
"Sequence and analysis of chromosome 2 of the plant Arabidopsis thaliana". Meinke
et al. (1998) Science 282:662-682, "Arabidopsis thaliana: a model plant for genome
analysis". Somerville and Somerville (1999) Science 285:380-383, "Plant functional
genomics". Mozo et al. (1999) Nat. Genet. 22:271-275, "A complete BAC-based
physical map of the Arabidopsis thaliana genome".

Summary of the Invention 
Novel nucleic acid sequences of Arabidopsis thaliana, their encoded
polypeptides and variants thereof, genes corresponding to these nucleic acids, and
proteins expressed by the genes, are provided.

The invention also provides diagnostic, prophylactic and therapeutic agents
employing such novel nucleic acids, their corresponding genes or gene products,
including expression constructs, probes, antisense constructs, and the like. The
genetic sequences may also be used for the genetic manipulation of plant cells,
particularly dicotyledonous plants. The encoded gene products and modified
organisms are useful for introducing or improving disease resistance and stress
tolerance into plants; for screening of biologically active agents, e.g. fungicides, etc.; for
elucidating biochemical pathways; and the like.

In one embodiment of the invention, a nucleic acid is provided that comprises
a start codon; an optional intervening sequence; a coding sequence capable of
hybridizing under stringent conditions to a sequence as set forth in SEQ ID NO:1 to 999;
and an optional terminal sequence, wherein at least one of said optional sequences is
present. Such a nucleic acid may correspond to naturally occurring Arabidopsis
expressed sequences.


Detailed Description of the Invention 
Novel nucleic acid sequences from Arabidopsis thaliana, their encoded
polypeptides and variants thereof, genes corresponding to these nucleic acids and
proteins expressed by the genes are provided. The invention also provides agents
employing such novel nucleic acids, their corresponding genes or gene products,
including expression constructs, probes, antisense constructs, and the like. The
nucleotide sequences are provided in the attached SEQLIST.

Sequences include, but are not limited to, sequences that encode resistance
proteins; sequences that encode tolerance factors; sequences encoding proteins or
other factors that are involved, directly or indirectly, in biochemical pathways such as
metabolic or biosynthetic pathways; sequences involved in signal transduction;
sequences involved in the regulation of gene expression; structural genes; and the
like. Biosynthetic pathways of interest include, but are not limited to, biosynthetic
pathways whose product (which may be an end product or an intermediate) is of
commercial, nutritional, or medicinal value.

The sequences may be used in screening assays of various plant strains to
determine the strains that are best capable of withstanding a particular disease or
environmental stress. Sequences encoding activators and resistance proteins may
be introduced into plants that are deficient in these sequences. Alternatively, the
sequences may be introduced under the control of promoters that are convenient for
induction of expression. The protein products may be used in screening programs
for insecticides, fungicides and antibiotics to determine agents that mimic or enhance
the resistance proteins. Such agents may be used in improved methods of treating
crops to prevent or treat disease. The protein products may also be used in
screening programs to identify agents which mimic or enhance the action of tolerance
factors. Such agents may be used in improved methods of treating crops to enhance
their tolerance to environmental stresses.

Still other embodiments of the invention provide methods for enhancing or
inhibiting production of a biosynthetic product in a plant by introducing a nucleic acid
of the invention into a plant cell, where the nucleic acid comprises sequences
encoding a factor which is involved, directly or indirectly, in a biosynthetic pathway
whose products are of commercial, nutritional, or medicinal value. Such factors
include any factor, usually a protein or peptide, which regulates such a biosynthetic
pathway; which is an intermediate in such a biosynthetic pathway; which in itself is a
product that increases the nutritional value of a food product; which is a medicinal
product; or which is any product of commercial value.



Transgenic plants containing the antisense nucleic acids of the invention are
useful for identifying other mediators that may induce expression of proteins of
interest; for establishing the extent to which any specific insect and/or pathogen is
responsible for damage of a particular plant; for identifying other mediators that may
enhance or induce tolerance to environmental stress; for identifying factors involved
in biosynthetic pathways of nutritional, commercial, or medicinal value; or for
identifying products of nutritional, commercial, or medicinal value.

In still other embodiments, the invention provides transgenic plants
constructed by introducing a subject nucleic acid of the invention into a plant cell, and
growing the cell into a callus and then into a plant; or, alternatively, by breeding a
transgenic plant from the subject process with a second plant to form an F1 or higher
hybrid. The subject transgenic plants and progeny are used as crops for their
enhanced disease resistance or enhanced traits of interest, for example size or flavor
of fruit, length of growth cycle, etc.; used for screening programs, e.g. to determine more
effective insecticides, etc.; used as crops which exhibit enhanced tolerance to
environmental stress; or used to produce a factor.

Those skilled in the art will recognize the agricultural advantages inherent in
plants constructed to have either increased or decreased expression of resistance
proteins; or increased or decreased tolerance to environmental factors; or which
produce or over-produce one or more factors involved in a biosynthetic pathway
whose product is of commercial, nutritional, or medicinal value. For example, such
plants may have increased resistance to attack by predators, insects, pathogens,
microorganisms, herbivores, mechanical damage and the like; may be more tolerant
to environmental stress, e.g. may be better able to withstand drought conditions,
freezing, and the like; or may produce a product not normally made in the plant, or
may produce a product in higher than normal amounts, where the product has
commercial, nutritional, or medicinal value. Plants which may be useful include
dicotyledons and monocotyledons. Representative examples of plants in which the
provided sequences may be useful include tomato, potato, tobacco, cotton, soybean,
alfalfa, rape, and the like. Monocotyledons, more particularly grasses (Poaceae
family) of interest, include, without limitation, Avena sativa (oat); Avena strigosa
(black oat); Elymus (wild rye); Hordeum sp., including Hordeum vulgare (barley);
Oryza sp., including Oryza glaberrima (African rice); Oryza longistaminata
(long-staminate rice); Pennisetum americanum (pearl millet); Sorghum sp. (sorghum);
Triticum sp., including Triticum aestivum (common wheat); Triticum durum (durum
wheat); Zea mays (corn); etc.

Nucleic Acid Compositions
The following detailed description describes the nucleic acid compositions
encompassed by the invention; methods for obtaining cDNA or genomic DNA
encoding a full-length gene product; expression of these nucleic acids and genes;
identification of structural motifs of the nucleic acids and genes; identification of the
function of a gene product encoded by a gene corresponding to a nucleic acid of the
invention; use of the provided nucleic acids as probes, in mapping, and in diagnosis;
use of the corresponding polypeptides and other gene products to raise antibodies;
use of the nucleic acids in genetic modification of plant and other species; and use of
the nucleic acids, their encoded gene products, and modified organisms, for
screening and diagnostic purposes.

The scope of the invention with respect to nucleic acid compositions includes,
but is not necessarily limited to, nucleic acids having a sequence set forth in any one
of SEQ ID NOS:1-999; nucleic acids that hybridize to the provided sequences under
stringent conditions; genes corresponding to the provided nucleic acids; and variants of
the provided nucleic acids and their corresponding genes, particularly those variants
that retain a biological activity of the encoded gene product.

In one embodiment, the sequences of the invention provide a polypeptide
coding sequence. The polypeptide coding sequence may correspond to a naturally
expressed mRNA in Arabidopsis or other species, or may encode a fusion protein
between one of the provided sequences and an exogenous protein coding sequence.
The coding sequence is characterized by an ATG start codon, a lack of stop codons
in-frame with the ATG, and a termination codon; that is, a continuous open frame is
provided between the start and the stop codon. The sequence contained between
the start and the stop codon will comprise a sequence capable of hybridizing under
stringent conditions to a sequence set forth in SEQ ID NOS:1-999, and may comprise
the sequence set forth in the SEQLIST.
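
The coding-sequence criteria above (ATG start, no in-frame stop codon, terminal stop codon) can be illustrated with a minimal Python sketch. The function name and the use of the standard stop codons TAA, TAG and TGA are assumptions made for illustration only; the check does not address hybridization to any listed sequence.

```python
# Illustrative sketch only: checks whether a DNA string forms a continuous
# open frame as described above. Names here are hypothetical.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard genetic code


def is_continuous_open_frame(seq: str) -> bool:
    """True if seq starts with ATG, ends with a stop codon, and has no
    stop codon in-frame with the ATG before the terminal codon."""
    seq = seq.upper()
    # Need at least a start and a stop codon, and a whole number of codons.
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    if codons[0] != "ATG" or codons[-1] not in STOP_CODONS:
        return False
    # No premature stop codon between the start and the terminal codon.
    return all(codon not in STOP_CODONS for codon in codons[1:-1])
```

For example, `is_continuous_open_frame("ATGGCTTAA")` accepts a three-codon frame, while a sequence with an internal in-frame `TAA` is rejected.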

Other nucleic acid compositions contemplated by and within the scope of the
present invention will be readily apparent to one of ordinary skill in the art when
provided with the disclosure here.

The invention features nucleic acids that are derived from Arabidopsis
thaliana. Novel nucleic acid compositions of the invention of particular interest
comprise a sequence set forth in any one of SEQ ID NOS:1-999 or an identifying
sequence thereof. An "identifying sequence" is a contiguous sequence of residues at
least about 10 nt to about 20 nt in length, usually at least about 50 nt to about 100 nt
in length, that uniquely identifies a nucleic acid sequence, e.g., exhibits less than
90%, usually less than about 80% to about 85% sequence identity to any contiguous
nucleotide sequence of more than about 20 nt. Thus, the subject novel nucleic acid
compositions include full length cDNAs or mRNAs that encompass an identifying
sequence of contiguous nucleotides from any one of SEQ ID NOS:1-999.

The nucleic acids of the invention also include nucleic acids having sequence
similarity or sequence identity. Nucleic acids having sequence similarity are detected
by hybridization under low stringency conditions, for example, at 50°C and 10XSSC
(0.9 M NaCl/0.09 M sodium citrate) and remain bound when subjected to washing at
55°C in 1XSSC. Sequence identity can be determined by hybridization under
stringent conditions, for example, at 50°C or higher and 0.1XSSC (9 mM NaCl/0.9
mM sodium citrate). Hybridization methods and conditions are well known in the art,
see U.S. Patent No. 5,707,829. Nucleic acids that are substantially identical to the
provided nucleic acid sequences, e.g. allelic variants, genetically altered versions of
the gene, etc., bind to the provided nucleic acid sequences (SEQ ID NOS:1-999)
under stringent hybridization conditions. By using probes, particularly labeled probes
of DNA sequences, one can isolate homologous or related genes. The source of
homologous genes can be any species, particularly grasses as previously described.

Preferably, hybridization is performed using at least 15 contiguous nucleotides
of at least one of SEQ ID NOS:1-999. The probe will preferentially hybridize with a
nucleic acid or mRNA comprising the complementary sequence, allowing the
identification and retrieval of the nucleic acids of the biological material that uniquely
hybridize to the selected probe. Probes of more than 15 nucleotides can be used,
e.g. probes of from about 18 nucleotides up to the entire length of the provided
nucleic acid sequences, but 15 nucleotides generally represents sufficient sequence
for unique identification.
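
As a rough illustration of the probe selection described above, the following Python sketch enumerates every window of 15 contiguous nucleotides in a sequence and pairs it with the complementary sequence a target would need to contain. The function names and the example sequence are hypothetical; no sequence from the SEQLIST is reproduced here.

```python
# Illustrative sketch only: enumerate fixed-length probes and the
# complementary target each would hybridize to. Names are hypothetical.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def probe_windows(seq: str, length: int = 15):
    """Return (probe, complementary_target) pairs for every window of
    `length` contiguous nucleotides in seq."""
    seq = seq.upper()
    return [
        (seq[i:i + length], reverse_complement(seq[i:i + length]))
        for i in range(len(seq) - length + 1)
    ]
```

An 18 nt input yields four 15 nt windows; passing a larger `length` (for example 18) gives the longer probes mentioned in the text.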

The nucleic acids of the invention also include naturally occurring variants of
the nucleotide sequences, e.g. degenerate variants, allelic variants, etc. Variants of
the nucleic acids of the invention are identified by hybridization of putative variants
with nucleotide sequences disclosed herein, preferably by hybridization under
stringent conditions. For example, by using appropriate wash conditions, variants of
the nucleic acids of the invention can be identified where the allelic variant exhibits at
most about 25-30% base pair mismatches relative to the selected nucleic acid probe.
In general, allelic variants contain 5-25% base pair mismatches, and can contain as
little as 2-5%, or 1-2% base pair mismatches, or even a single base-pair
mismatch.
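
The mismatch percentages above can be made concrete with a short Python sketch that counts mismatched positions between a probe and an equal-length, already aligned target region. The function name is hypothetical, and the sketch assumes an ungapped comparison; it says nothing about actual hybridization behavior under a given wash condition.

```python
# Illustrative sketch only: percent base mismatches between two aligned,
# equal-length sequences. The name percent_mismatch is hypothetical.
def percent_mismatch(probe: str, target: str) -> float:
    """Return the percentage of positions at which probe and target differ."""
    if not probe or len(probe) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    mismatches = sum(
        1 for a, b in zip(probe.upper(), target.upper()) if a != b
    )
    return 100.0 * mismatches / len(probe)
```

A single differing base in a 10 nt comparison gives 10%, which falls within the 5-25% range stated for allelic variants in general.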

The invention also encompasses homologs corresponding to the nucleic acids
of SEQ ID NOS:1-999, where the source of homologous genes can be any related
species, usually within the same genus or group. Homologs have substantial
sequence similarity, e.g. at least 75% sequence identity, usually at least 90%, more
usually at least 95% between nucleotide sequences. Sequence similarity is
calculated based on a reference sequence, which may be a subset of a larger
sequence, such as a conserved motif, coding region, flanking region, etc. A
reference sequence will usually be at least about 18 contiguous nt long, more usually
at least about 30 nt long, and may extend to the complete sequence that is being
compared. Algorithms for sequence analysis are known in the art, such as BLAST,
described in Altschul et al., J. Mol. Biol. (1990) 215:403-10.

In general, variants of the invention have a sequence identity of at
least about 65%, preferably at least about 75%, more preferably at least about 85%,
and can be at least about 90% or more, as determined by the Smith-Waterman
homology search algorithm as implemented in MPSRCH program (Oxford
Molecular). For the purposes of this invention, a preferred method of calculating
percent identity is the Smith-Waterman algorithm, applied as follows. Global DNA
sequence identity must be greater than 65% as determined by the Smith-Waterman
homology search algorithm as implemented in MPSRCH program (Oxford Molecular)
using an affine gap search with the following search parameters: gap open penalty,
12; and gap extension penalty, 1.
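
To make the affine gap search concrete, the following is a minimal Python sketch of Smith-Waterman local alignment scoring with affine gap penalties, using the gap open penalty of 12 and gap extension penalty of 1 stated above. It is not the MPSRCH implementation. The match and mismatch scores (+5 and -4) and the convention that a gap of length k costs gap_open + (k - 1) * gap_extend are assumptions chosen for illustration, and the sketch returns only the best local score rather than a percent identity.

```python
# Illustrative sketch only: Smith-Waterman local alignment score with
# affine gaps (Gotoh recurrences). Not the MPSRCH program.
def smith_waterman_affine(a: str, b: str, match: int = 5, mismatch: int = -4,
                          gap_open: int = 12, gap_extend: int = 1) -> int:
    """Return the best local alignment score between a and b.

    Assumptions: match/mismatch scores are hypothetical defaults, and a
    gap of length k costs gap_open + (k - 1) * gap_extend.
    """
    a, b = a.upper(), b.upper()
    n, m = len(a), len(b)
    neg_inf = float("-inf")
    # H: best score of an alignment ending at (i, j)
    # E: best score ending at (i, j) with a gap consuming b[j-1]
    # F: best score ending at (i, j) with a gap consuming a[i-1]
    H = [[0] * (m + 1) for _ in range(n + 1)]
    E = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    F = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Either extend an existing gap or open a new one.
            E[i][j] = max(E[i][j - 1] - gap_extend, H[i][j - 1] - gap_open)
            F[i][j] = max(F[i - 1][j] - gap_extend, H[i - 1][j] - gap_open)
            score = match if a[i - 1] == b[j - 1] else mismatch
            # Local alignment: scores never drop below zero.
            H[i][j] = max(0, H[i - 1][j - 1] + score, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best
```

Under these assumed scores, two identical 4 nt sequences score 20, and a 20 nt sequence aligned against a copy carrying a 3 nt insertion scores 100 - (12 + 2) = 86.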

The subject nucleic acids can be cDNAs or genomic DNAs, as well as
fragments thereof, particularly fragments that encode a biologically active gene
product and/or are useful in the methods disclosed herein. The term "cDNA" as used
herein is intended to include all nucleic acids that share the arrangement of
sequence elements found in native mature mRNA species, where sequence
elements are exons and 3' and 5' non-coding regions. Normally mRNA species have
contiguous exons, with the introns, when present, being removed by nuclear RNA
splicing, to create a continuous open reading frame encoding a polypeptide of the
invention.

A genomic sequence of interest comprises the nucleic acid present between
the initiation codon and the stop codon, as defined in the listed sequences, including
all of the introns that are normally present in a native chromosome. It can further
include the 3' and 5' untranslated regions found in the mature mRNA. It can further
include specific transcriptional and translational regulatory sequences, such as
promoters, enhancers, etc., including about 1 kb, but possibly more, of flanking
genomic DNA at either the 5' or 3' end of the transcribed region. The genomic
DNA can be isolated as a fragment of 100 kb or smaller, and substantially free of
flanking chromosomal sequence. The genomic DNA flanking the coding region,
either 3' or 5', or internal regulatory sequences as sometimes found in introns,
contains sequences required for expression.

The nucleic acid compositions of the subject invention can encode all or a part
of the subject expressed polypeptides. Double or single stranded fragments can be
obtained from the DNA sequence by chemically synthesizing oligonucleotides in
accordance with conventional methods, by restriction enzyme digestion, by PCR
amplification, etc. Isolated nucleic acids and nucleic acid fragments of the invention
comprise at least about 15 up to about 100 contiguous nucleotides, or up to the
complete sequence provided in SEQ ID NOS:1-999. For the most part, fragments
will be of at least 15 nt, usually at least 18 nt or 25 nt, and up to at least about 50
contiguous nt in length or more.

Probes specific to the nucleic acids of the invention can be generated using
the nucleic acid sequences disclosed in SEQ ID NOS:1-999 and the fragments as
described above. The probes can be synthesized chemically or can be generated
from longer nucleic acids using restriction enzymes. The probes can be labeled, for
example, with a radioactive, biotinylated, or fluorescent tag. Preferably, probes are
designed based upon an identifying sequence of a nucleic acid of one of SEQ ID
NOS:1-999. More preferably, probes are designed based on a contiguous sequence
of one of the subject nucleic acids that remains unmasked following application of a
masking program for masking low complexity (e.g., XBLAST) to the sequence, i.e.
one would select an unmasked region, as indicated by the nucleic acids outside the
poly-n stretches of the masked sequence produced by the masking program.
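
The selection of an unmasked region can be sketched in Python as follows. The sketch assumes the masking program has already replaced low-complexity residues with runs of N (or n), and simply reports the stretches outside those runs that are long enough to serve as a probe. The function name and the 15 nt default minimum are illustrative choices, and the masking step itself is not reproduced.

```python
# Illustrative sketch only: locate unmasked stretches in a sequence whose
# low-complexity regions were already replaced by poly-N runs.
import re


def unmasked_regions(masked_seq: str, min_length: int = 15):
    """Return (start, end, subsequence) for each run outside the poly-N
    stretches that is at least min_length nucleotides long.
    Coordinates are 0-based, end-exclusive."""
    return [
        (m.start(), m.end(), m.group())
        for m in re.finditer(r"[^Nn]+", masked_seq)
        if len(m.group()) >= min_length
    ]
```

For a masked sequence made of an 18 nt unmasked stretch, eight N residues, and a trailing 4 nt stretch, only the 18 nt stretch is returned as a probe candidate.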

The nucleic acids of the subject invention are isolated and obtained in
substantial purity, generally as other than an intact chromosome. Usually, the nucleic
acids, either as DNA or RNA, will be obtained substantially free of other
naturally-occurring nucleic acid sequences, generally being at least about 50%, usually at
least about 90% pure, and are typically "recombinant", e.g., flanked by one or more
nucleotides with which they are not normally associated on a naturally occurring
chromosome.

The nucleic acids of the invention can be provided as a linear molecule or
within a circular molecule. They can be provided within autonomously replicating
molecules (vectors) or within molecules without replication sequences. They can be
regulated by their own or by other regulatory sequences, as is known in the art. The
nucleic acids of the invention can be introduced into suitable host cells using a
variety of techniques which are available in the art, such as transferrin
polycation-mediated DNA transfer, transfection with naked or encapsulated nucleic acids,
liposome-mediated DNA transfer, intracellular transportation of DNA-coated latex
beads, protoplast fusion, viral infection, electroporation, gene gun, calcium
phosphate-mediated transfection, and the like.

The subject nucleic acid compositions can be used to, for example, produce
polypeptides, as probes for the detection of mRNA of the invention in biological
samples, e.g. extracts of cells, to generate additional copies of the nucleic acids, to
generate ribozymes or antisense oligonucleotides, and as single stranded DNA
probes or as triple-strand forming oligonucleotides. The probes described herein can
be used to, for example, determine the presence or absence of the nucleic acid
sequences as shown in SEQ ID NOS:1-999 or variants thereof in a sample. These
and other uses are described in more detail below.

Use of Nucleic Acids as Coding Sequences

Naturally occurring Arabidopsis polypeptides or fragments thereof are
encoded by the provided nucleic acids. Methods are known in the art to determine
whether the complete native protein is encoded by a candidate nucleic acid
sequence. Where the provided sequence encodes a fragment of a polypeptide,
methods known in the art may be used to determine the remaining sequence. These
approaches may utilize a bioinformatics approach, a cloning approach, extension of
mRNA species, etc.

Substantial genomic sequence is available for Arabidopsis, and may be
exploited for determining the complete coding sequence corresponding to the
provided sequences. The region of the chromosome in which a given sequence is
located may be determined by hybridization or by database searching. The genomic
sequence is then searched upstream and downstream for the presence of
intron/exon boundaries, and for motifs characteristic of transcriptional start and stop
sequences, for example by using Genscan (Burge and Karlin (1997) J. Mol. Biol.
268:78-94); or GRAIL (Uberbacher and Mural (1991) P.N.A.S. 88:11261-11265).

Alternatively, nucleic acid having a sequence of one of SEQ ID NOS:1-999, or
an identifying fragment thereof, is used as a hybridization probe to complementary
molecules in a cDNA library using probe design methods, cloning methods, and
clone selection techniques as known in the art. Libraries of cDNA are made from
selected cells. The cells may be those of A. thaliana, or of related species. In some
cases it will be desirable to select cells from a particular stage, e.g. seeds, leaves,
infected cells, etc.

Techniques for producing and probing nucleic acid sequence libraries are
described, for example, in Sambrook et al., Molecular Cloning: A Laboratory Manual,
2nd Ed., (1989) Cold Spring Harbor Press, Cold Spring Harbor, NY; and Current
Protocols in Molecular Biology, (1987 and updates) Ausubel et al., eds. The cDNA
can be prepared by using primers based on sequence from SEQ ID NOS:1-999. In
one embodiment, the cDNA library can be made from only poly-adenylated mRNA.
Thus, poly-T primers can be used to prepare cDNA from the mRNA.

Members of the library that are larger than the provided nucleic acids, and
preferably that encompass the complete coding sequence of the native message, are
obtained. In order to confirm that the entire cDNA has been obtained, RNA
protection experiments are performed as follows. Hybridization of a full-length cDNA
to an mRNA will protect the RNA from RNase degradation. If the cDNA is not full
length, then the portions of the mRNA that are not hybridized will be subject to
RNase degradation. This is assayed, as is known in the art, by changes in
electrophoretic mobility on polyacrylamide gels, or by detection of released
monoribonucleotides. Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd
Ed., (1989) Cold Spring Harbor Press, Cold Spring Harbor, NY. In order to obtain
additional sequences 5' to the end of a partial cDNA, 5' RACE (PCR Protocols: A
Guide to Methods and Applications, (1990) Academic Press, Inc.) may be performed.

Genomic DNA is isolated using the provided nucleic acids in a manner similar
to the isolation of full-length cDNAs. Briefly, the provided nucleic acids, or portions
thereof, are used as probes to libraries of genomic DNA. Preferably, the library is
obtained from the cell type that was used to generate the nucleic acids of the
invention, but this is not essential. Such libraries can be in vectors suitable for
carrying large segments of a genome, such as P1 or YAC, as described in detail in
Sambrook et al., 9.4-9.30. In order to obtain additional 5' or 3' sequences,
chromosome walking is performed, as described in Sambrook et al., such that
adjacent and overlapping fragments of genomic DNA are isolated. These are
mapped and pieced together, as is known in the art, using restriction digestion
enzymes and DNA ligase.

PCR methods may be used to amplify the members of a cDNA library that
comprise the desired insert. In this case, the desired insert will contain sequence
from the full length cDNA that corresponds to the instant nucleic acids. Such PCR
methods include gene trapping and RACE methods. Gene trapping entails inserting
a member of a cDNA library into a vector. The vector then is denatured to produce
single stranded molecules. Next, a substrate-bound probe, such as a biotinylated oligo,
is used to trap cDNA inserts of interest. Biotinylated probes can be linked to an
avidin-bound solid substrate. PCR methods can be used to amplify the trapped
cDNA. To trap sequences corresponding to the full length genes, the labeled probe
sequence is based on the nucleic acid sequences of the invention. Random primers
or primers specific to the library vector can be used to amplify the trapped cDNA.
Such gene trapping techniques are described in Gruber et al., WO 95/04745 and
Gruber et al., U.S. Pat. No. 5,500,356. Kits are commercially available to perform
gene trapping experiments from, for example, Life Technologies, Gaithersburg,
Maryland, USA.

"Rapid amplification of cDNA ends", or RACE, is a PCR method of amplifying
cDNAs from a number of different RNAs. The cDNAs are ligated to an
oligonucleotide linker, and amplified by PCR using two primers. One primer is based
on sequence from the instant nucleic acids, for which full length sequence is desired,
and a second primer comprises sequence that hybridizes to the oligonucleotide linker
to amplify the cDNA. A description of this method is reported in WO 97/19110. A
common primer may be designed to anneal to an arbitrary adaptor sequence ligated
to cDNA ends. When a single gene-specific RACE primer is paired with the common
primer, preferential amplification of sequences between the single gene-specific
primer and the common primer occurs. Commercial cDNA pools modified for use in
RACE are available.

Once the full-length cDNA or gene is obtained, DNA encoding variants can be
prepared by site-directed mutagenesis, described in detail in Sambrook et al.,
15.3-15.63. The choice of codon or nucleotide to be replaced can be based on disclosure
herein on optional changes in amino acids to achieve altered protein structure and/or
function. As an alternative method to obtaining DNA or RNA from a biological
material, nucleic acid comprising nucleotides having the sequence of one or more
nucleic acids of the invention can be synthesized.


Expression of Polypeptides 
The provided nucleic acid (e.g. a nucleic acid having a sequence of one of
SEQ ID NOS:1-999), the corresponding cDNA, the polypeptide coding sequence as
described above, or the full-length gene is used to express a partial or complete gene
product. Constructs of nucleic acids having sequences of SEQ ID NOS:1-999 can be
generated by recombinant methods or synthetically. A single-step assembly of a
gene and entire plasmid from large numbers of oligodeoxyribonucleotides is
described by, e.g., Stemmer et al., Gene (Amsterdam) (1995) 164(1):49-53.

Appropriate nucleic acid constructs are purified using standard recombinant
DNA techniques as described in, for example, Sambrook et al., Molecular Cloning: A
Laboratory Manual, 2nd Ed., (1989) Cold Spring Harbor Press, Cold Spring Harbor,
NY. The gene product encoded by a nucleic acid of the invention is expressed in any
expression system, including, for example, bacterial, yeast, insect, amphibian and
mammalian systems.

The subject nucleic acid molecules are generally propagated by placing the
molecule in a vector. Viral and non-viral vectors are used, including plasmids. The
choice of plasmid will depend on the type of cell in which propagation is desired and
the purpose of propagation. Certain vectors are useful for amplifying and making
large amounts of the desired DNA sequence. Other vectors are suitable for
expression in cells in culture. Still other vectors are suitable for transfer and
expression in cells in a whole organism or person. The choice of appropriate vector is
well within the skill of the art. Many such vectors are available commercially.

The nucleic acids set forth in SEQ ID NOS: 1-999 or their corresponding full-length
nucleic acids are linked to regulatory sequences as appropriate to obtain the
desired expression properties. These can include promoters attached either at the 5'
end of the sense strand or at the 3' end of the antisense strand, enhancers,
terminators, operators, repressors, and inducers. The promoters can be regulated or
constitutive. In some situations it may be desirable to use conditionally active
promoters, such as tissue-specific or developmental stage-specific promoters. These
are linked to the desired nucleotide sequence using the techniques described above
for linkage to vectors. Any techniques known in the art can be used.

When any of the above host cells, or other appropriate host cells or
organisms, are used to replicate and/or express the nucleic acids of
the invention, the resulting replicated nucleic acid, RNA, expressed protein or
polypeptide is within the scope of the invention as a product of the host cell or
organism. The product is recovered by any appropriate means known in the art.

Identification of Functional and Structural Motifs 
Translations of the nucleotide sequence of the provided nucleic acids, cDNAs
or full genes can be aligned with individual known sequences. Similarity with
individual sequences can be used to determine the activity of the polypeptides
encoded by the nucleic acids of the invention. Also, sequences exhibiting similarity
with more than one individual sequence can exhibit activities that are characteristic of
either or both individual sequences.

The six possible reading frames may be translated using programs such as
GCG pepdata or GCG Frames (Wisconsin Package Version 10.0, Genetics
Computer Group (GCG), Madison, Wisconsin, USA). Programs such as
ORF Finder (National Center for Biotechnology Information (NCBI), a division of the
National Library of Medicine (NLM) at the National Institutes of Health (NIH),
http://www.ncbi.nlm.nih.gov/) may be used to identify open reading frames (ORFs) in
sequences. ORF Finder identifies all possible ORFs in a DNA sequence by locating
the standard and alternative stop and start codons. Other ORF identification
programs include Genie (Kulp et al. (1996).

A generalized Hidden Markov Model may be used for the recognition of genes
in DNA. (ISMB-96, St Louis, MO, AAAI/MIT Press; Reese et al. (1997), "Improved
splice site detection in Genie", Proceedings of the First Annual International
Conference on Computational Molecular Biology RECOMB 1997, Santa Fe, NM,
ACM Press, New York, p. 34); BESTORF - Prediction of potential coding fragment
in human or plant EST/mRNA sequence data using Markov Chain Models; and
FGENEP - Multiple genes structure prediction in plant genomic DNA (Solovyev et al.
(1995) Identification of human gene structure using linear discriminant functions and
dynamic programming, in Proceedings of the Third International Conference on
Intelligent Systems for Molecular Biology, eds. Rawling et al., Cambridge, England,
AAAI Press, 367-375; Solovyev et al. (1994) Nucl. Acids Res. 22(24):5156-5163;
Solovyev et al., The prediction of human exons by oligonucleotide composition and
discriminant analysis of spliceable open reading frames, in: The Second International
Conference on Intelligent Systems for Molecular Biology (eds. Altman et al.), AAAI
Press, Menlo Park, CA (1994, 354-362); Solovyev and Lawrence, Prediction of
human gene structure using dynamic programming and oligonucleotide composition,
in: Abstracts of the 4th Annual Keck Symposium, Pittsburgh, 47, 1993; Burge and
Karlin (1997) J. Mol. Biol. 268:78-94; Kulp et al. (1996) Proc. Conf. on Intelligent
Systems in Molecular Biology '96, 134-142).
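
By way of illustration only, translation of a nucleic acid in all six reading frames and a simple scan for open reading frames can be sketched as follows. This is a minimal sketch and not the method of any program named above: it assumes the standard genetic code, treats only ATG as a start codon (ORF Finder also considers alternative start codons), and the function names six_frame_translations and find_orfs are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons; unknown codons become 'X', stops become '*'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Map frame labels (+1..+3, -1..-3) to their translations."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


def find_orfs(protein, min_len=1):
    """Return peptides that begin with M and are terminated by a stop codon."""
    orfs = []
    # The last piece after split() is not followed by a stop, so it is skipped.
    for segment in protein.split("*")[:-1]:
        start = segment.find("M")
        if start != -1 and len(segment) - start >= min_len:
            orfs.append(segment[start:])
    return orfs
```

Each translated frame can then be used as a query sequence for alignment against individual sequences.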

The full length sequences and fragments of the nucleic acid sequences of the
nearest neighbors can be used as probes and primers to identify and isolate the full
length sequence corresponding to provided nucleic acids. Typically, a selected
nucleic acid is translated in all six frames to determine the best alignment with the
individual sequences. These amino acid sequences are referred to, generally, as
query sequences, which are aligned with the individual sequences. Suitable
databases include Genbank, EMBL, and DNA Database of Japan (DDBJ).

Query and individual sequences can be aligned using the methods and
computer programs described above, and include BLAST, available by ftp at
ftp://ncbi.nlm.nih.gov/.

Gapped BLAST and PSI-BLAST (version 2.0) are useful search tools provided by NCBI
(Altschul et al., 1997). Position-Specific Iterated BLAST (PSI-BLAST)
provides an automated, easy-to-use version of a "profile" search, which is a sensitive
way to look for sequence homologues. The program first performs a gapped BLAST
database search. The PSI-BLAST program uses the information from any significant
alignments returned to construct a position-specific score matrix, which replaces the
query sequence for the next round of database searching. PSI-BLAST may be
iterated until no new significant alignments are found. The Gapped BLAST algorithm
allows gaps (deletions and insertions) to be introduced into the alignments that are
returned. Allowing gaps means that similar regions are not broken into several
segments. The scoring of these gapped alignments tends to reflect biological
relationships more closely. Smith-Waterman is another algorithm that produces
local or global gapped sequence alignments; see Meth. Mol. Biol. (1997) 70:173-187.
Also, the GAP program using the Needleman and Wunsch global alignment
method can be utilized for sequence alignments.
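
By way of illustration only, a global alignment in the manner of Needleman and Wunsch can be sketched with dynamic programming as follows. This is a minimal sketch and not the GAP program itself: the scoring values (match +1, mismatch -1, linear gap penalty -1) and the function name needleman_wunsch are assumptions chosen for the example, whereas practical tools use substitution matrices and affine gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align strings a and b; return (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

Because every residue of both sequences is forced into the alignment, this approach suits sequences expected to be similar over their whole length.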

Results of individual and query sequence alignments can be divided into three
categories: high similarity, weak similarity, and no similarity. Individual alignment
results ranging from high similarity to weak similarity provide a basis for determining
polypeptide activity and/or structure. Parameters for categorizing individual results
include: percentage of the alignment region length where the strongest alignment is
found, percent sequence identity, and e value.

The percentage of the alignment region length is calculated by counting the
number of residues of the individual sequence found in the region of strongest
alignment, i.e., the contiguous region of the individual sequence that contains the
greatest number of residues that are identical to the residues of the corresponding
region of the aligned query sequence. This number is divided by the total residue
length of the query sequence to calculate a percentage. For example, a query
sequence of 20 amino acid residues might be aligned with a 20 amino acid region of
an individual sequence. The individual sequence might be identical to amino acid
residues 5, 9-15, and 17-19 of the query sequence. The region of strongest
alignment is thus the region stretching from residue 9 to residue 19, an 11 amino acid stretch.
The percentage of the alignment region length is: 11 (length of the region of
strongest alignment) divided by 20 (query sequence length), or 55%.

Percent sequence identity is calculated by counting the number of amino acid
matches between the query and individual sequence and dividing the total number of
matches by the number of residues of the individual sequences found in the region of
strongest alignment. Thus, the percent identity in the example above would be 10
matches divided by 11 amino acids, or approximately 90.9%.
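
By way of illustration only, the two percentages in the worked example above can be reproduced as follows. This is a minimal sketch: the function name alignment_metrics is hypothetical, and the boundaries of the region of strongest alignment (residues 9 to 19) are supplied by the caller rather than derived, since they are taken directly from the example.

```python
def alignment_metrics(identical_positions, region_start, region_end, query_length):
    """Return (region_length, matches, pct_region_length, pct_identity).

    Positions are 1-based residue numbers of the query sequence.
    """
    region_length = region_end - region_start + 1
    # Count only identical residues that fall inside the strongest region.
    matches = sum(1 for p in identical_positions if region_start <= p <= region_end)
    pct_region_length = 100.0 * region_length / query_length
    pct_identity = 100.0 * matches / region_length
    return region_length, matches, pct_region_length, pct_identity


# Identical residues from the example: 5, 9-15, and 17-19 of a 20-residue query.
identical = [5] + list(range(9, 16)) + list(range(17, 20))
length, matches, pct_len, pct_id = alignment_metrics(identical, 9, 19, 20)
print(length, matches, round(pct_len, 1), round(pct_id, 1))  # 11 10 55.0 90.9
```

Residue 5 is identical but lies outside the region of strongest alignment, so it does not count toward the 10 matches.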

E value is the probability that the alignment was produced by chance. For a
single alignment, the e value can be calculated according to Karlin et al., Proc. Natl.
Acad. Sci. (1990) 87:2264 and Karlin et al., Proc. Natl. Acad. Sci. (1993) 90. The e
value of multiple alignments using the same query sequence can be calculated using
a heuristic approach described in Altschul et al., Nat. Genet. (1994) 6:119.
Alignment programs such as BLAST can calculate the e value.

Another factor to consider for determining identity or similarity is the location of
the similarity or identity. Strong local alignment can indicate similarity even if the
length of alignment is short. Sequence identity scattered throughout the length of the
query sequence also can indicate a similarity between the query and profile
sequences. The boundaries of the region where the sequences align can be
determined according to Doolittle, supra; BLAST or FASTA programs; or by
determining the area where sequence identity is highest.

In general, in alignment results considered to be of high similarity, the percent
of the alignment region length is typically at least about 55% of the total length of the
query sequence; more typically, at least about 58%; even more typically, at least about 60%
of the total residue length of the query sequence. Usually, percent length of the
alignment region can be as much as about 62%; more usually, as much as about
64%; even more usually, as much as about 66%. Further, for high similarity, the
region of alignment typically exhibits at least about 75% sequence identity; more
typically, at least about 78%; even more typically, at least about 80% sequence
identity. Usually, percent sequence identity can be as much as about 82%; more
usually, as much as about 84%; even more usually, as much as about 86%.

The p value is used in conjunction with these methods. The query sequence 
is considered to have a high similarity with a profile sequence when the p value is 
less than or equal to 10"^. Confidence in the degree of similarity between the query 
sequence and the profile sequence increases as the p value becomes smaller.

In general, where alignment results are considered to be of weak similarity, there
is no minimum percent length of the alignment region nor minimum length of
alignment. A better showing of weak similarity is considered when the region of
alignment is, typically, at least about 15 amino acid residues in length; more typically,
at least about 20; even more typically, at least about 25 amino acid residues in
length. Usually, length of the alignment region can be as much as about 30 amino
acid residues; more usually, as much as about 40; even more usually, as much as
about 60 amino acid residues. Further, for weak similarity, the region of alignment
typically exhibits at least about 35% sequence identity; more typically, at least
about 40%; even more typically, at least about 45% sequence identity. Usually,
percent sequence identity can be as much as about 50%; more usually, as much as
about 55%; even more usually, as much as about 60%.

The query sequence is considered to have a low similarity with a profile 
sequence when the p value is greater than 10"^. Confidence in the degree of 
similarity between the query sequence and the profile sequence decreases as the p 
values become larger. 

Sequence identity alone can be used to determine similarity of a query
sequence to an individual sequence and can indicate the activity of the sequence.
Such an alignment, preferably, permits gaps to align sequences. Typically, the query
sequence is related to the profile sequence if the sequence identity over the entire
query sequence is at least about 15%; more typically, at least about 20%; even more
typically, at least about 25%; even more typically, at least about 50%. Sequence
identity alone as a measure of similarity is most useful when the query sequence is
usually, at least 80 residues in length; more usually, 90 residues; even more usually,
at least 95 amino acid residues in length. More typically, similarity can be concluded
based on sequence identity alone when the query sequence is preferably 100
residues in length; more preferably, 120 residues in length; even more preferably,
150 amino acid residues in length.

It is apparent, when studying protein sequence families, that some regions
have been better conserved than others during evolution. These regions are
generally important for the function of a protein and/or for the maintenance of its
three-dimensional structure. By analyzing the constant and variable properties of
such groups of similar sequences, it is possible to derive a signature for a protein
family or domain, which distinguishes its members from all other unrelated proteins.
A pertinent analogy is the use of fingerprints by the police for identification purposes.
A fingerprint is generally sufficient to identify a given individual. Similarly, a protein
signature can be used to assign a new sequence to a specific family of proteins and
thus to formulate hypotheses about its function. The PROSITE database is a
compendium of such fingerprints (motifs) and may be used with search software such
as Wisconsin GCG Motifs to find motifs or fingerprints in query sequences.
PROSITE currently contains signatures specific for about a thousand protein families
or domains. Each of these signatures comes with documentation providing
background information on the structure and function of these proteins (Hofmann et
al. (1999) Nucleic Acids Res. 27:215-219; Bucher and Bairoch, A generalized profile
syntax for biomolecular sequences motifs and its function in automatic sequence
interpretation, in ISMB-94; Proceedings 2nd International Conference on Intelligent
Systems for Molecular Biology; Altman et al., Eds. (1994), pp 53-61, AAAI Press,
Menlo Park).
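
By way of illustration only, a signature written in PROSITE pattern syntax can be converted to a regular expression and used to scan a translated query sequence, as sketched below. This is a minimal sketch and not the GCG Motifs program: the function names are hypothetical, the conversion handles only the basic pattern elements (x, [..], {..}, repetition counts, and the < and > anchors), and the example uses the N-glycosylation site pattern N-{P}-[ST]-{P}.

```python
import re


def prosite_to_regex(pattern):
    """Convert a basic PROSITE pattern string to a Python regular expression."""
    regex = pattern.rstrip(".")       # patterns conventionally end with a period
    regex = regex.replace("-", "")    # '-' only separates pattern elements
    regex = regex.replace("x", ".")   # 'x' matches any residue
    # '{ABC}' lists forbidden residues; convert before '(' is turned into '{'.
    regex = regex.replace("{", "[^").replace("}", "]")
    # '(n)' or '(n,m)' are repetition counts.
    regex = regex.replace("(", "{").replace(")", "}")
    # '<' and '>' anchor the pattern to the N- and C-terminus.
    regex = regex.replace("<", "^").replace(">", "$")
    return regex


def find_motif(pattern, protein):
    """Return 1-based start positions of all (possibly overlapping) matches."""
    # A lookahead reports overlapping occurrences as well.
    compiled = re.compile("(?=" + prosite_to_regex(pattern) + ")")
    return [m.start() + 1 for m in compiled.finditer(protein)]


print(prosite_to_regex("N-{P}-[ST]-{P}."))         # N[^P][ST][^P]
print(find_motif("N-{P}-[ST]-{P}.", "MNGTANPSK"))  # [2]
```

In the example sequence the asparagine at position 6 is followed by proline, which the pattern forbids, so only position 2 is reported.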

Translations of the provided nucleic acids can be aligned with amino acid
profiles that define either protein families or common motifs. Also, translations of the
provided nucleic acids can be aligned to multiple sequence alignments (MSA)
comprising the polypeptide sequences of members of protein families or motifs.
Similarity or identity with profile sequences or MSAs can be used to determine the
activity of the gene products (e.g., polypeptides) encoded by the provided nucleic
acids or corresponding cDNA or genes.

Profiles can be designed manually by (1) creating an MSA, which is an alignment
of the amino acid sequence of members that belong to the family and (2) constructing
a statistical representation of the alignment. Such methods are described, for
example, in Birney et al., Nucl. Acid Res. (1996) 24(14): 2730-2739. MSAs of some
protein families and motifs are available for downloading to a local server. For
example, the PFAM database with MSAs of 547 different families and motifs, and the
software (HMMER) to search the PFAM database may be downloaded from
ftp://ftp.genetics.wustl.edu/pub/eddy/pfam-4.4/ to allow secure searches on a local
server. Pfam is a database of multiple alignments of protein domains or conserved
protein regions, which represent evolutionarily conserved structure that has
implications for the protein's function (Sonnhammer et al. (1998) Nucl. Acid Res.
26:320-322; Bateman et al. (1999) Nucleic Acids Res. 27:260-262).
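
By way of illustration only, one simple statistical representation of an MSA is a position-specific log-odds matrix, sketched below. This is a minimal sketch and not the profile hidden Markov model used by HMMER: the function names, the uniform background frequency, the pseudocount of 1, and the toy alignment in the usage lines are all assumptions chosen for the example.

```python
from collections import Counter
from math import log2

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(msa, pseudocount=1.0):
    """Build one log-odds column (dict residue -> score) per alignment column.

    msa is a list of equal-length aligned sequences; gap characters are ignored.
    """
    background = 1.0 / len(AMINO_ACIDS)  # assumed uniform background
    profile = []
    for col in range(len(msa[0])):
        counts = Counter(seq[col] for seq in msa if seq[col] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        profile.append({
            aa: log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile


def score_sequence(profile, protein):
    """Sum the column scores for an ungapped sequence of standard residues."""
    return sum(column[aa] for column, aa in zip(profile, protein))


# Hypothetical three-member family alignment, for illustration only.
profile = build_profile(["ACD", "ACE", "ACD"])
print(score_sequence(profile, "ACD") > score_sequence(profile, "WWW"))  # True
```

A translated query that scores well above unrelated sequences against such a matrix is a candidate member of the family from which the MSA was built.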

The 3D_ali databank (Pasarella, S. and Argos, P. (1992) Prot. Engineering
5:121-137) was constructed to incorporate new protein structural and sequence data.
The databank has proved useful in many research fields such as protein sequence
and structure analysis and comparison, protein folding, engineering and design and
evolution. The collection enhances present protein structural knowledge by merging
information from proteins of similar main-chain fold with homologous primary
structures taken from large databases of all known sequences. 3D_ali databank files
may be downloaded to a secure local server from
http://www.embl-heidelberg.de/argos/ali/ali_form.html.

The identity and function of the gene that correlates to a nucleic acid
described herein can be determined by screening the nucleic acids or their
corresponding amino acid sequences against profiles of protein families. Such
profiles focus on common structural motifs among proteins of each family. Publicly
available profiles are known in the art.

In comparing a novel nucleic acid with known sequences, several alignment
tools are available. Examples include PileUp, which creates a multiple sequence
alignment, and is described in Feng et al., J. Mol. Evol. (1987) 25:351. Another
method, GAP, uses the alignment method of Needleman et al., J. Mol. Biol. (1970)
48:443. GAP is best suited for global alignment of sequences. A third method,
BestFit, functions by inserting gaps to maximize the number of matches using the
local homology algorithm of Smith et al. (1981) Adv. Appl. Math. 2:482.

Identification of Secreted & Membrane-Bound Polypeptides 
Secreted and membrane-bound polypeptides of the present invention are of
interest. Because both secreted and membrane-bound polypeptides comprise a
fragment of contiguous hydrophobic amino acids, hydrophobicity predicting
algorithms can be used to identify such polypeptides. A signal sequence is usually
encoded by both secreted and membrane-bound polypeptide genes to direct a
polypeptide to the surface of the cell. The signal sequence usually comprises a
stretch of hydrophobic residues. Such signal sequences can fold into helical
structures. Membrane-bound polypeptides typically comprise at least one
transmembrane region that possesses a stretch of hydrophobic amino acids that can
traverse the membrane. Some transmembrane regions also exhibit a helical
structure. Hydrophobic fragments within a polypeptide can be identified by using
computer algorithms. Such algorithms include Hopp & Woods, Proc. Natl. Acad. Sci.
USA (1981) 78:3824-3828; Kyte & Doolittle, J. Mol. Biol. (1982) 157:105-132; and
RAOAR algorithm, Degli Esposti et al., Eur. J. Biochem. (1990) 190:207-219.
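
By way of illustration only, a sliding-window hydropathy profile in the manner of Kyte & Doolittle can be sketched as follows. This is a minimal sketch: the per-residue values are the Kyte & Doolittle hydropathy scale, while the function name hydropathy_profile and the default window of 9 residues are assumptions chosen for the example.

```python
# Kyte & Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=9):
    """Return the mean hydropathy of every window of the given length.

    Element i of the result covers residues i .. i+window-1 (0-based).
    Sequences shorter than the window give an empty list.
    """
    values = [KYTE_DOOLITTLE[aa] for aa in protein]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]
```

Windows whose mean value is strongly positive mark candidate hydrophobic fragments; the cut-off applied to the profile is left to the user.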

Another method of identifying secreted and membrane-bound polypeptides is
to translate the nucleic acids of the invention in all six frames and determine if at least
8 contiguous hydrophobic amino acids are present. Those translated polypeptides
with at least 8; more typically, 10; even more typically, 12 contiguous hydrophobic
amino acids are considered to be either a putative secreted or membrane bound
polypeptide. Hydrophobic amino acids include alanine, glycine, histidine, isoleucine,
leucine, lysine, methionine, phenylalanine, proline, threonine, tryptophan, tyrosine,
and valine.
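
By way of illustration only, the test for contiguous hydrophobic amino acids can be sketched as follows. This is a minimal sketch: the set of hydrophobic residues is exactly the list given above in one-letter code, the function names are hypothetical, and the translated frames are assumed to be supplied as strings in which a stop codon appears as "*".

```python
# One-letter codes for the hydrophobic amino acids listed above:
# Ala, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Thr, Trp, Tyr, Val.
HYDROPHOBIC = set("AGHILKMFPTWYV")


def longest_hydrophobic_run(protein):
    """Return the length of the longest run of contiguous hydrophobic residues."""
    best = current = 0
    for aa in protein:
        # Any other residue, including a stop '*', ends the current run.
        current = current + 1 if aa in HYDROPHOBIC else 0
        best = max(best, current)
    return best


def is_putative_secreted_or_membrane(frames, min_run=8):
    """True if any translated frame has at least min_run hydrophobic residues."""
    return any(longest_hydrophobic_run(frame) >= min_run for frame in frames)
```

Setting min_run to 10 or 12 applies the more typical thresholds given above.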

Identification of the Function of an Expression Product 
The biological function of the encoded gene product of the invention may be
determined by empirical or deductive methods. One promising avenue, termed
phylogenomics, exploits the use of evolutionary information to facilitate assignment of
gene function. The approach is based on the idea that functional predictions can be
greatly improved by focusing on how genes became similar in sequence during
evolution instead of focusing on the sequence similarity itself. One of the major
efficiencies that has emerged from plant genome research to date is that a large
percentage of higher plant genes can be assigned some degree of function by
comparing them with the sequences of genes of known function.

Alternatively, "reverse genetics" is used to identify gene function. Large
collections of insertion mutants are available for Arabidopsis, maize, petunia, and
snapdragon. These collections can be screened for an insertional inactivation of any
gene by using the polymerase chain reaction (PCR) primed with oligonucleotides
based on the sequences of the target gene and the insertional mutagen. The
presence of an insertion in the target gene is indicated by the presence of a PCR
product. By multiplexing DNA samples, hundreds of thousands of lines can be
screened and the corresponding mutant plants can be identified with relatively small
effort. Analysis of the phenotype and other properties of the corresponding mutant
will provide an insight into the function of the gene.

In one method of the invention, the gene function in a transgenic Arabidopsis
plant is assessed with anti-sense constructs. A high degree of gene duplication is
apparent in Arabidopsis, and many of the gene duplications in Arabidopsis are very
tightly linked. Large numbers of transgenic Arabidopsis plants can be generated by
infecting flowers with Agrobacterium tumefaciens containing an insertional mutagen.
A method of gene silencing based on producing double-stranded RNA from
bidirectional transcription of genes in transgenic plants can be broadly useful for
high-throughput gene inactivation (Clough and Bent (1999) Plant J. 17; Waterhouse et al.
(1998) Proc. Natl. Acad. Sci. U.S.A. 95:13959). This method may use promoters that
are expressed in only a few cell types or at a particular developmental stage or in
response to an external stimulus. This could significantly obviate problems
associated with the lethality of some mutations.

Virus-induced gene silencing may also find use for suppressing gene function.
This method exploits the fact that some or all plants have a surveillance system that
can specifically recognize viral nucleic acids and mount a sequence-specific
suppression of viral RNA accumulation. By inoculating plants with a recombinant
virus containing part of a plant gene, it is possible to rapidly silence the endogenous
plant gene.

Antisense nucleic acids are designed to specifically bind to RNA, resulting in
the formation of RNA-DNA or RNA-RNA hybrids, with an arrest of DNA replication,
reverse transcription or messenger RNA translation. Antisense nucleic acids based
on a selected nucleic acid sequence can interfere with expression of the
corresponding gene. Antisense nucleic acids are typically generated within the cell
by expression from antisense constructs that contain the antisense strand as the
transcribed strand. Antisense nucleic acids based on the disclosed nucleic acids will
bind and/or interfere with the translation of mRNA comprising a sequence
complementary to the antisense nucleic acid. The expression products of control
cells and cells treated with the antisense construct are compared to detect the protein
product of the gene corresponding to the nucleic acid upon which the antisense
construct is based. The protein is isolated and identified using routine biochemical
methods.

As an alternative method for identifying function of the gene corresponding to
a nucleic acid disclosed herein, dominant negative mutations are readily generated
for corresponding proteins that are active as homomultimers. A mutant polypeptide
will interact with wild-type polypeptides (made from the other allele) and form a
non-functional multimer. Thus, a mutation is in a substrate-binding domain, a catalytic
domain, or a cellular localization domain. Preferably, the mutant polypeptide will be
overproduced. Point mutations are made that have such an effect. In addition,
fusion of different polypeptides of various lengths to the terminus of a protein can
yield dominant negative mutants. General strategies are available for making
dominant negative mutants (see for example, Herskowitz (1987) Nature 329:219).
Such techniques can be used to create loss of function mutations, which are useful
for determining protein function.

Another approach for discovering the function of genes utilizes gene chips and
microarrays. DNA sequences representing all the genes in an organism can be
placed on miniature solid supports and used as hybridization substrates to quantitate
the expression of all the genes represented in a complex mRNA sample. This
information is used to provide extensive databases of quantitative information about
the degree to which each gene responds to pathogens, pests, drought, cold, salt,
photoperiod, and other environmental variation. Similarly, one obtains extensive
information about which genes respond to changes in developmental processes such
as germination and flowering. One can therefore determine which genes respond to
the phytohormones, growth regulators, safeners, herbicides, and related
agrichemicals. These databases of gene expression information provide insights into
the "pathways" of genes that control complex responses. The accumulation of DNA
microarray or gene chip data from many different experiments creates a powerful
opportunity to assign functional information to genes of otherwise unknown function.
The conceptual basis of the approach is that genes that contribute to the same
biological process will exhibit similar patterns of expression. Thus, by clustering
genes based on the similarity of their relative levels of expression in response to
diverse stimuli or developmental or environmental conditions, it is possible to assign
functions to many genes based on the known function of other genes in the cluster.
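
By way of illustration only, grouping genes by the similarity of their expression patterns can be sketched as follows. This is a minimal sketch: it uses Pearson correlation with a simple greedy grouping against the first member of each cluster, which is one of many possible clustering schemes; the function names, the 0.9 threshold, and the gene names and expression values in the usage lines are hypothetical and are not data from this disclosure.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def cluster_by_correlation(profiles, threshold=0.9):
    """Greedily group genes whose profile correlates with a cluster's first gene.

    profiles maps gene name -> list of relative expression levels, one value
    per condition (same condition order for every gene).
    """
    clusters = []
    for gene, profile in profiles.items():
        for cluster in clusters:
            if pearson(profile, profiles[cluster[0]]) >= threshold:
                cluster.append(gene)
                break
        else:
            clusters.append([gene])  # no sufficiently similar cluster found
    return clusters


# Hypothetical relative expression across four conditions.
example = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
print(cluster_by_correlation(example))  # [['geneA', 'geneB'], ['geneC']]
```

A gene of unknown function that falls in a cluster with genes of known function becomes a candidate for the same biological process.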

Construction of Polypeptides of the Invention and Variants Thereof 
The polypeptides of the invention include those encoded by the disclosed
nucleic acids. These polypeptides can also be encoded by nucleic acids that, by
virtue of the degeneracy of the genetic code, are not identical in sequence to the
disclosed nucleic acids. Thus, the invention includes within its scope a polypeptide
encoded by a nucleic acid having the sequence of any one of SEQ ID NOS: 1-999 or
a variant thereof.

In general, the term "polypeptide" as used herein refers to the full length
polypeptide encoded by the recited nucleic acid, the polypeptide encoded by the
gene represented by the recited nucleic acid, as well as portions or fragments
thereof. "Polypeptides" also includes variants of the naturally occurring proteins,
where such variants are homologous or substantially similar to the naturally occurring
protein, and can be of an origin of the same or different species as the naturally
occurring protein. In general, variant polypeptides have a sequence that has at least
about 80%, usually at least about 90%, and more usually at least about 98%
sequence identity with a differentially expressed polypeptide of the invention, as
measured by BLAST using the parameters described above. The variant
polypeptides can be naturally or non-naturally glycosylated, i.e., the polypeptide has
a glycosylation pattern that differs from the glycosylation pattern found in the
corresponding naturally occurring protein.

In general, the polypeptides of the subject invention are provided in a
non-naturally occurring environment, e.g. are separated from their naturally occurring
environment. In certain embodiments, the subject protein is present in a composition
that is enriched for the protein as compared to a control. As such, purified
polypeptide is provided, where by purified is meant that the protein is present in a
composition that is substantially free of non-differentially expressed polypeptides,
where by substantially free is meant that less than 90%, usually less than 60% and
more usually less than 50% of the composition is made up of non-differentially
expressed polypeptides.

Also within the scope of the invention are variants; variants of polypeptides
include mutants, fragments, and fusions. Mutants can include amino acid
substitutions, additions or deletions. The amino acid substitutions can be
conservative amino acid substitutions or substitutions to eliminate non-essential
amino acids, such as to alter a glycosylation site, a phosphorylation site or an
acetylation site, or to minimize misfolding by substitution or deletion of one or more
cysteine residues that are not necessary for function. Conservative amino acid
substitutions are those that preserve the general charge,
hydrophobicity/hydrophilicity, and/or steric bulk of the amino acid substituted.

Variants also include fragments of the polypeptides disclosed herein,
particularly biologically active fragments and/or fragments corresponding to functional
domains. Fragments of interest will typically be at least about 10 amino acids (aa) to
at least about 15 aa in length, usually at least about 50 aa in length, and can be as
long as 300 aa in length or longer, but will usually not exceed about 1000 aa in
length, where the fragment will have a stretch of amino acids that is identical to a
polypeptide encoded by a nucleic acid having a sequence of any of SEQ ID NOS: 1-999,
or a homolog thereof.

The protein variants described herein are encoded by nucleic acids that are
within the scope of the invention. The genetic code can be used to select the
appropriate codons to construct the corresponding variants.

Libraries and Arrays 
In general, a library of biopolymers is a collection of sequence information,
which information is provided in either biochemical form (e.g., as a collection of
nucleic acid or polypeptide molecules), or in electronic form (e.g., as a collection of
genetic sequences stored in a computer-readable form, as in a computer system
and/or as part of a computer program). The term biopolymer, as used herein, is
intended to refer to polypeptides, nucleic acids, and derivatives thereof, which
molecules are characterized by the possession of genetic sequences either
corresponding to, or encoded by, the sequences set forth in the provided sequence
list (seqlist). The sequence information can be used in a variety of ways, e.g., as a
resource for gene discovery, as a representation of sequences expressed in a
selected cell type, e.g. cell type markers, etc.

The nucleic acid libraries of the subject invention include sequence information
of a plurality of nucleic acid sequences, where at least one of the nucleic acids has a
sequence of any of SEQ ID NOS: 1-999. By plurality is meant one or more, usually at
least 2 and can include up to all of SEQ ID NOS: 1-999. The length and number of
nucleic acids in the library will vary with the nature of the library, e.g., if the library is
an oligonucleotide array, a cDNA array, a computer database of the sequence
information, etc.

Where the library is an electronic library, the nucleic acid sequence
information can be present in a variety of media. "Media" refers to a manufacture,
other than an isolated nucleic acid molecule, that contains the sequence information
of the present invention. Such a manufacture provides the sequences or a subset
thereof in a form that can be examined by means not directly applicable to the
sequence as it exists in a nucleic acid. For example, the nucleotide sequence of the
present invention, e.g. the nucleic acid sequences of any of the nucleic acids of SEQ
ID NOS:1-999, can be recorded on computer readable media, e.g. any medium that
can be read and accessed directly by a computer. Such media include, but are not
limited to: magnetic storage media, such as a floppy disc, a hard disc storage
medium, and a magnetic tape; optical storage media such as CD-ROM; electrical
storage media such as RAM and ROM; and hybrids of these categories such as
magnetic/optical storage media. One of skill in the art can readily appreciate how
any of the presently known computer readable media can be used to create a
manufacture comprising a recording of the present sequence information.
"Recorded" refers to a process for storing information on computer readable medium,
using any such methods as known in the art. Any convenient data storage structure
can be chosen, based on the means used to access the stored information. A variety
of data processor programs and formats can be used for storage, e.g. word
processing text file, database format, etc. In addition to the sequence information,
electronic versions of the libraries of the invention can be provided in conjunction or
connection with other computer-readable information and/or other types of
computer-readable files (e.g., searchable files, executable files, etc., including, but
not limited to, for example, search program software, etc.).
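As one illustration of recording sequence information in a plain text storage structure, the sketch below writes and re-reads records in the widely used FASTA layout. FASTA is an assumption made for the example (the specification names only "text file" and "database format"), and the identifiers and sequences shown are placeholders, not the sequences of SEQ ID NOS:1-999.

```python
import io


def write_fasta(records, handle, width=60):
    """Write (identifier, sequence) pairs to a text handle in FASTA layout."""
    for identifier, sequence in records:
        handle.write(f">{identifier}\n")
        # Wrap long sequences onto fixed-width lines.
        for start in range(0, len(sequence), width):
            handle.write(sequence[start:start + width] + "\n")


def read_fasta(handle):
    """Parse FASTA text back into a dict mapping identifier -> sequence."""
    records = {}
    current = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:]
            records[current] = []
        elif current is not None:
            records[current].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


# Placeholder records for illustration only.
library = [("SEQ_ID_NO_1", "ATGGCTTCTAAGGGTTGA"),
           ("SEQ_ID_NO_2", "ATGAAACCCGGGTTTTAG")]
buffer = io.StringIO()
write_fasta(library, buffer)
buffer.seek(0)
restored = read_fasta(buffer)
```

The same functions work unchanged with a file opened on any of the media listed above, since they rely only on a text handle.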

By providing the nucleotide sequence in computer readable form, the
information can be accessed for a variety of purposes. Computer software to access
sequence information is publicly available. For example, the BLAST (Altschul et al.,
supra) and BLAZE (Brutlag et al. Comp. Chem. (1993) 17:203) search algorithms on
a Sybase system can be used to identify open reading frames (ORFs) within the
genome that contain homology to ORFs from other organisms.
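The first half of that task, locating candidate ORFs, can be sketched without any external software. The function below is a simplified stand-in, not BLAST or BLAZE: it scans only the forward strand, assumes the standard ATG start and TAA/TAG/TGA stop codons, and the `min_codons` threshold is an arbitrary illustrative parameter. A homology search against ORFs from other organisms would follow as a separate step.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence, min_codons=3):
    """Return (start, end, frame) for ATG...stop ORFs on the forward strand.

    Coordinates are 0-based; end is exclusive and includes the stop codon.
    """
    sequence = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(sequence) - 2, 3):
            codon = sequence[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos  # first start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (pos + 3 - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, frame))
                start = None  # look for the next ORF in this frame
    return orfs
```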

As used herein, "a computer-based system" refers to the hardware means,
software means, and data storage means used to analyze the nucleotide sequence
information of the present invention. The minimum hardware of the computer-based
systems of the present invention comprises a central processing unit (CPU), input
means, output means, and data storage means. A skilled artisan can readily
appreciate that any one of the currently available computer-based systems is
suitable for use in the present invention. The data storage means can comprise any
manufacture comprising a recording of the present sequence information as
described above, or a memory access means that can access such a manufacture.

"Search means" refers to one or more programs implemented on the
computer-based system, to compare a target sequence or target structural motif with
the stored sequence information. Search means are used to identify fragments or
regions of the genome that match a particular target sequence or target motif. A
variety of algorithms are publicly known and commercially available, e.g.
MacPattern (EMBL), BLASTN, BLASTX (NCBI) and tBLASTX. A "target sequence"
can be any DNA or amino acid sequence of six or more nucleotides or two or more
amino acids, preferably from about 10 to 100 amino acids or from about 30 to 300
nucleotide residues.

A "target structural motif," or "target motif," refers to any rationally selected
sequence or combination of sequences in which the sequence(s) are chosen based
on a three-dimensional configuration that is formed upon the folding of the target
motif, or on consensus sequences of regulatory or active sites. There are a variety of
target motifs known in the art. Protein target motifs include, but are not limited to,
enzyme active sites and signal sequences. Nucleic acid target motifs include, but are
not limited to, hairpin structures, promoter sequences and other expression elements
such as binding sites for transcription factors.
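A consensus-sequence motif of the kind described can be expressed as a regular expression and scanned against stored sequence. The sketch below is illustrative only: the pattern is a commonly cited TATA-box-like consensus chosen for the example, not a motif defined in this specification, and the function name is hypothetical.

```python
import re

# Illustrative consensus for a TATA-box-like promoter element.
# This pattern is an assumption for the example, not part of the specification.
TATA_LIKE = re.compile(r"TATA[AT]A[AT]")


def find_motif(sequence, pattern=TATA_LIKE):
    """Return (start, matched_text) for each non-overlapping motif match."""
    return [(m.start(), m.group()) for m in pattern.finditer(sequence.upper())]
```

Structural motifs such as hairpins cannot be captured by a simple pattern like this and would need a dedicated folding or pairing check.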

A variety of structural formats for the input and output means can be used to
input and output the information in the computer-based systems of the present
invention. One format for an output means ranks fragments of the genome
possessing varying degrees of homology to a target sequence or target motif. Such
presentation provides a skilled artisan with a ranking of sequences and identifies the
degree of sequence similarity contained in the identified fragment.

A variety of comparing means can be used to compare a target sequence or
target motif with the data storage means to identify sequence fragments of the
genome. A skilled artisan can readily recognize that any one of the publicly available
homology search programs can be used as the search means for the
computer-based systems of the present invention.
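A ranked output of this kind can be sketched with a deliberately simple comparing means. The example below scores each stored sequence by the fraction of the target's k-mers it contains and sorts by that score. This is a toy similarity measure assumed for illustration, not the scoring used by any of the homology search programs mentioned; the default `k=6` merely echoes the six-nucleotide minimum target length given above.

```python
def kmers(sequence, k):
    """Set of all length-k substrings of the sequence."""
    sequence = sequence.upper()
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def rank_by_similarity(target, stored, k=6):
    """Rank stored sequences by the fraction of target k-mers they contain."""
    target_kmers = kmers(target, k)
    if not target_kmers:
        raise ValueError("target must be at least k residues long")
    scored = []
    for name, sequence in stored.items():
        shared = len(target_kmers & kmers(sequence, k))
        scored.append((name, shared / len(target_kmers)))
    # Highest similarity first, as in a ranked search report.
    return sorted(scored, key=lambda item: item[1], reverse=True)
```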

As discussed above, the "library" of the invention also encompasses
biochemical libraries of the nucleic acids of SEQ ID NOS:1-999, e.g., collections of
nucleic acids representing the provided nucleic acids. The biochemical libraries can
take a variety of forms, e.g. a solution of cDNAs, a pattern of probe nucleic acids
stably bound to a surface of a solid support (microarray) and the like. By array is
meant an article of manufacture that has a solid support or substrate with one or
more nucleic acid targets on one of its surfaces, where the number of distinct nucleic
acids may be in the hundreds, thousands, or tens of thousands. Each nucleic acid will
comprise at least 18 nt and often at least 25 nt, and often at least 100 to 1000 nucleotides,
and may represent up to a complete coding sequence or cDNA. A variety of
different array formats have been developed and are known to those of skill in the art.
The arrays of the subject invention find use in a variety of applications, including
gene expression analysis, drug screening, mutation analysis and the like, as
disclosed in the above-listed exemplary patent documents.

In addition to the above nucleic acid libraries, analogous libraries of
polypeptides are also provided, where the polypeptides of the library will
represent at least a portion of the polypeptides encoded by SEQ ID NOS:1-999.

Genetically Altered Cells and Transgenics 

The subject nucleic acids can be used to create genetically modified and
transgenic organisms, usually plant cells and plants, which may be monocots or
dicots. The term transgenic, as used herein, is defined as an organism into which an
exogenous nucleic acid construct has been introduced; generally the exogenous
sequences are stably maintained in the genome of the organism. Of particular
interest are transgenic organisms where the genomic sequence of germ line cells has
been stably altered by introduction of an exogenous construct.

Typically, the transgenic organism is altered in the genetic expression of the
introduced nucleotide sequences as compared to the wild-type, or unaltered
organism. For example, constructs that provide for over-expression of a targeted
sequence, sometimes referred to as a "knock-in", provide for increased levels of the
gene product. Alternatively, expression of the targeted sequence can be
down-regulated or substantially eliminated by introduction of a "knock-out" construct, which
may direct transcription of an anti-sense RNA that blocks expression of the naturally
occurring mRNA, by deletion of the genomic copy of the targeted sequence, etc.

In one method, large numbers of genes are simultaneously introduced in order
to explore the genetic basis of complex traits, for example by making plant artificial
chromosome (PLAC) libraries. The centromeres in Arabidopsis have been mapped
and current genome sequencing efforts will extend through these regions. Because
Arabidopsis telomeres are very similar to those in yeast, one may use a hybrid
sequence of alternating plant and yeast sequences that function in both types of
organisms, developing yeast artificial chromosome-PLAC libraries, and then
introducing them into a suitable plant host to evaluate the phenotypic consequences.
By providing a defined chromosomal environment for cloned genes, the use of
PLACs may also enhance the ability to produce transgenic plants with defined levels
of gene expression.

It has been found in many organisms that there is significant redundancy in
the representation of genes in a genome. That is, a particular gene function is likely
to be represented by multiple copies of similar coding sequences in the genome. These
copies are typically conserved in the amino acid sequence, but may diverge in the
sequence of non-translated sequences, and in their codon usage. In order to knock
out a particular genetic function in an organism, it may not be sufficient to delete a
genomic copy of a single gene. In such cases it may be preferable to achieve a
genetic knock-out with an anti-sense construct, particularly where the sequence is
aligned with the coding portion of the mRNA.
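The relationship between a coding sequence and an anti-sense transcript aligned with it can be shown computationally: the anti-sense RNA is the reverse complement of the sense strand, written as RNA. The following is a minimal sketch with hypothetical function names; it handles only unambiguous A/C/G/T bases and says nothing about construct design or efficacy.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Reverse complement of a DNA string (A/C/G/T only)."""
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_rna(coding_dna):
    """RNA complementary to the mRNA transcribed from the given sense strand."""
    return reverse_complement(coding_dna).upper().replace("T", "U")
```

For the sense strand `ATGGCC` the mRNA reads `AUGGCC`, and the function returns the anti-sense RNA that base-pairs with it.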

Methods of transforming plant cells are well-known in the art, and include
protoplast transformation, tungsten whiskers (Coffee et al., U.S. Pat. No. 5,302,523,
issued Apr. 12, 1994), directly by microorganisms with infectious plasmids, use of
transposons (U.S. Patent No. 5,792,294), infectious viruses, the use of liposomes,
microinjection by mechanical or laser beam methods, by whole chromosomes or
chromosome fragments, electroporation, silicon carbide fibers, and microprojectile
bombardment.

For example, one may utilize the biolistic bombardment of meristem tissue, at
a very early stage of development, and the selective enhancement of transgenic
sectors toward genetic homogeneity, in cell layers that contribute to germline
transmission. Biolistics-mediated production of fertile, transgenic maize is described
in Gordon-Kamm et al. (1990), Plant Cell 2:603; Fromm et al. (1990) Bio/Technology
8:833, for example. Alternatively, one may use a microorganism, including but not
limited to, Agrobacterium tumefaciens as a vector for transforming the cells,
particularly where the targeted plant is a dicotyledonous species. See, for example,
U.S. Patent No. 5,635,381. Leung et al. (1990) Curr. Genet. 17(5):409-11 describe
integrative transformation of three fertile hermaphroditic strains of Arabidopsis
thaliana using plasmids and cosmids that contain an E. coli gene linked to Aspergillus
nidulans regulatory sequences.

Preferred expression cassettes for cereals may include promoters that are
known to express exogenous DNAs in corn cells. For example, the AdhI promoter
has been shown to be strongly expressed in callus tissue, root tips, and developing
kernels in corn. Promoters that are used to express genes in corn include, but are not
limited to, a plant promoter such as the CaMV 35S promoter (Odell et al., Nature,
313, 810 (1985)), or others such as CaMV 19S (Lawton et al., Plant Mol. Biol., 9, 315
(1987)), nos (Ebert et al., PNAS USA, 84, 5745 (1987)), Adh (Walker et al., PNAS
USA, 84, 6624 (1987)), sucrose synthase (Yang et al., PNAS USA, 87, 4144 (1990)),
α-tubulin, ubiquitin, actin (Wang et al., Mol. Cell. Biol., 12, 3399 (1992)), cab
(Sullivan et al., Mol. Gen. Genet., 215, 431 (1989)), PEPCase (Hudspeth et al., Plant
Mol. Biol., 12, 579 (1989)), or those associated with the R gene complex (Chandler et
al., The Plant Cell, 1, 1175 (1989)). Other promoters useful in the practice of the
invention are known to those of skill in the art.

Tissue-specific promoters, including but not limited to, root-cell promoters
(Conkling et al., Plant Physiol., 93, 1203 (1990)), and tissue-specific enhancers
(Fromm et al., The Plant Cell, 1, 977 (1989)) are also contemplated to be particularly
useful, as are inducible promoters such as water-stress-, ABA- and turgor-inducible
promoters (Guerrero et al., Plant Molecular Biology, 15, 11-26), and the like.

Regulating and/or limiting the expression in specific tissues may be
functionally accomplished by introducing a constitutively expressed gene (all tissues)
in combination with an antisense gene that is expressed only in those tissues where
the gene product is not desired. Expression of an antisense transcript of this
preselected DNA segment in a rice grain, using, for example, a zein promoter,
would prevent accumulation of the gene product in seed. Hence the protein encoded
by the preselected DNA would be present in all tissues except the kernel.

Alternatively, one may wish to obtain novel tissue-specific promoter
sequences for use in accordance with the present invention. To achieve this, one
may first isolate cDNA clones from the tissue concerned and identify those clones
which are expressed specifically in that tissue, for example, using Northern blotting or
DNA microarrays. Ideally, one would like to identify a gene that is not present in a
high copy number, but whose gene product is relatively abundant in specific tissues.
The promoter and control elements of corresponding genomic clones may then be
localized using the techniques of molecular biology known to those of skill in the art.
Alternatively, promoter elements can be identified using enhancer traps based on
T-DNA and/or transposon vector systems (see, for example, Campisi et al. (1999) Plant
J. 17:699-707; Gu et al. (1998) Development 125:1509-1517).

In some embodiments of the present invention expression of a DNA segment
in a transgenic plant will occur only in a certain time period during the development of
the plant. Developmental timing is frequently correlated with tissue specific gene
expression. For example, in corn expression of zein storage proteins is initiated in the
endosperm about 15 days after pollination.

Ultimately, the most desirable DNA segments for introduction into a plant
genome may be homologous genes or gene families which encode a desired trait
(e.g., increased disease resistance) and which are introduced under the control of
novel promoters or enhancers, etc., or perhaps even homologous or tissue-specific
(e.g., root-, grain- or leaf-specific) promoters or control elements.

The genetically modified cells are screened for the presence of the introduced
genetic material. The cells may be used in functional studies, drug screening, etc.,
e.g. to study chemical mode of action, to determine the effect of a candidate agent on
pathogen growth, infection of plant cells, etc.

The modified cells are useful in the study of genetic function and regulation,
for alteration of the cellular metabolism, and for screening compounds that may affect
the biological function of the gene or gene product. For example, a series of small
deletions and/or substitutions may be made in the host's native gene to determine
the role of different domains and motifs in the biological function. Specific constructs
of interest include anti-sense, as previously described, which will reduce or abolish
expression, expression of dominant negative mutations, and over-expression of
genes.

Where a sequence is introduced, the introduced sequence may be either a
complete or partial sequence of a gene native to the host, or may be a complete or
partial sequence that is exogenous to the host organism, e.g., an A. thaliana
sequence inserted into wheat plants. A detectable marker, such as aldA, lac Z, etc.,
may be introduced into the locus of interest, where upregulation of expression will
result in an easily detected change in phenotype.

One may also provide for expression of the gene or variants thereof in cells or
tissues where it is not normally expressed, at levels not normally present in such cells
or tissues, or at abnormal times of development, during sporulation, etc. By providing
expression of the protein in cells in which it is not normally produced, one can induce
changes in cell behavior.

DNA constructs for homologous recombination will comprise at least a portion
of the provided gene or of a gene native to the species of the host organism, wherein
the gene has the desired genetic modification(s), and includes regions of homology
to the target locus (see Kempin et al. (1997) Nature 389:802-803). DNA constructs
for random integration or episomal maintenance need not include regions of
homology to mediate recombination. Conveniently, markers for positive and negative
selection are included. Methods for generating cells having targeted gene
modifications through homologous recombination are known in the art.

Embodiments of the invention provide processes for enhancing or inhibiting
synthesis of a protein in a plant by introducing a provided nucleic acid sequence into
a plant cell, where the nucleic acid comprises sequences encoding a protein of
interest. For example, enhanced resistance to pathogens may be achieved by
inserting a nucleic acid encoding an activator in a vector downstream from a
promoter sequence capable of driving constitutive high-level expression in a plant
cell. When grown into plants, the transgenic plants exhibit increased synthesis of
resistance proteins, and increased resistance to pathogens.

Other embodiments of the invention provide processes for enhancing or
inhibiting synthesis of a tolerance factor in a plant by introducing a nucleic acid of the
invention into a plant cell, where the nucleic acid comprises sequences encoding a
tolerance factor. For example, enhanced tolerance to an environmental stress may
be achieved by inserting a nucleic acid encoding an activator in a vector downstream
from a promoter sequence capable of driving constitutive high-level expression in a
plant cell. When grown into plants, the transgenic plants exhibit increased synthesis
of tolerance proteins, and increased tolerance to environmental stress.

Factors which are involved, directly or indirectly, in biosynthetic pathways
whose products are of commercial, nutritional, or medicinal value include any factor,
usually a protein or peptide, which regulates such a biosynthetic pathway (e.g., an
activator or repressor); which is an intermediate in such a biosynthetic pathway; or
which is a product that increases the nutritional value of a food product; a medicinal
product; or any product of commercial value and/or research interest. Plant and
other cells may be genetically modified to enhance a trait of interest, by upregulating
or down-regulating factors in a biosynthetic pathway.

Screening Assays 

The polypeptides encoded by the provided nucleic acid sequences, and cells
genetically altered to express such sequences, are useful in a variety of screening
assays to determine the effect of candidate inhibitors, activators, or modifiers of the
gene product. One may determine what insecticides, fungicides and the like have an
enhancing or synergistic activity with a gene. Alternatively, one may screen for
compounds that mimic the activity of the protein. Similarly, the effect of activating
agents may be used to screen for compounds that mimic or enhance the activation of
proteins. Candidate inhibitors of a particular gene product are screened by detecting
decreased activity from the targeted gene product.

The screening assays may use purified target macromolecules to screen large
compound libraries for inhibitory drugs; or the purified target molecule may be used
for a rational drug design program, which requires first determining the structure of
the macromolecular target or the structure of the macromolecular target in
association with its customary substrate or ligand. This information is then used to
design compounds which must be synthesized and tested further. Test results are
used to refine the molecular models and drug design process in an iterative fashion
until a lead compound emerges.

Drug screening may be performed using an in vitro model, a genetically
altered cell, or purified protein. One can identify ligands or substrates that bind to,
modulate or mimic the action of the target genetic sequence or its product. A wide
variety of assays may be used for this purpose, including labeled in vitro
protein-protein binding assays, electrophoretic mobility shift assays, immunoassays for
protein binding, and the like. The purified protein may also be used for determination
of three-dimensional crystal structure, which can be used for modeling intermolecular
interactions.

Where the nucleic acid encodes a factor involved in a biosynthetic pathway, as
described above, it may be desirable to identify factors, e.g., protein factors, which
interact with such factors. One can identify interacting factors, ligands, substrates
that bind to, modulate or mimic the action of the target genetic sequence or its
product. A wide variety of assays may be used for this purpose, including labeled in
vitro protein-protein binding assays, electrophoretic mobility shift assays,
immunoassays for protein binding, and the like. In vivo assays for protein-protein
interactions in E. coli and yeast cells are also well-established (see Hu et al. (2000)
Methods 20:80-94; and Bai and Elledge (1997) Methods Enzymol. 283:141-156).

The purified protein may also be used for determination of three-dimensional
crystal structure, which can be used for modeling intermolecular interactions. It may
also be of interest to identify agents that modulate the interaction of a factor identified
as described above with a factor encoded by a nucleic acid of the invention. Drug
screening can be performed to identify such agents. For example, a labeled in vitro
protein-protein binding assay can be used, which is conducted in the presence and
absence of an agent being tested.

The term "agent" as used herein describes any molecule, e.g. protein or
pharmaceutical, with the capability of altering or mimicking a physiological function.
Generally a plurality of assay mixtures are run in parallel with different agent
concentrations to obtain a differential response to the various concentrations.
Typically, one of these concentrations serves as a negative control, i.e. at zero
concentration or below the level of detection.
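The role of the negative control in such a parallel series can be made concrete with a small calculation: each reading is expressed relative to the zero-concentration mixture. The sketch below assumes readings are supplied as a mapping from agent concentration to assay signal; the function name, units, and data layout are illustrative assumptions, not part of the described assay.

```python
def differential_response(readings):
    """Subtract the zero-concentration (negative control) signal from each reading.

    readings: dict mapping agent concentration -> measured assay signal.
    Returns a dict ordered by increasing concentration.
    """
    if 0.0 not in readings:
        raise ValueError("a zero-concentration negative control is required")
    baseline = readings[0.0]
    return {conc: signal - baseline for conc, signal in sorted(readings.items())}
```

A control run below the level of detection rather than at exactly zero would need to be keyed explicitly, which this simplified version does not handle.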

Candidate agents encompass numerous chemical classes, though typically
they are organic molecules, preferably small organic compounds having a molecular
weight of more than 50 and less than about 2,500 daltons. Candidate agents
comprise functional groups necessary for structural interaction with proteins,
particularly hydrogen bonding, and typically include at least an amine, carbonyl,
hydroxyl or carboxyl group, preferably at least two of the functional chemical groups.
The candidate agents often comprise cyclical carbon or heterocyclic structures and/or
aromatic or polyaromatic structures substituted with one or more of the above
functional groups. Candidate agents are also found among biomolecules including
peptides, saccharides, fatty acids, steroids, purines, pyrimidines, derivatives,
structural analogs or combinations thereof.
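The molecular weight window and functional group preference stated above can be applied as a simple pre-filter over a compound list. The sketch below is one hypothetical way to encode those criteria; the `Candidate` record, its field names, and the idea of listing functional groups as strings are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    molecular_weight: float      # daltons
    functional_groups: tuple     # e.g. ("amine", "hydroxyl")


# Groups named in the text as typical for interaction with proteins.
INTERACTING_GROUPS = {"amine", "carbonyl", "hydroxyl", "carboxyl"}


def passes_filter(candidate, min_mw=50.0, max_mw=2500.0, min_groups=1):
    """True if the candidate lies in the MW window and has enough listed groups."""
    matching = INTERACTING_GROUPS.intersection(candidate.functional_groups)
    in_window = min_mw < candidate.molecular_weight < max_mw
    return in_window and len(matching) >= min_groups
```

Setting `min_groups=2` reflects the stated preference for at least two of the functional chemical groups.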

Candidate agents are obtained from a wide variety of sources including
libraries of synthetic or natural compounds. For example, numerous means are
available for random and directed synthesis of a wide variety of organic compounds
and biomolecules, including expression of randomized oligonucleotides and
oligopeptides. Alternatively, libraries of natural compounds in the form of bacterial,
fungal, plant and organism extracts are available or readily produced. Additionally,
natural or synthetically produced libraries and compounds are readily modified
through conventional chemical, physical and biochemical means, and may be used to
produce combinatorial libraries. Known pharmacological agents may be subjected to
directed or random chemical modifications, such as acylation, alkylation,
esterification, amidification, etc. to produce structural analogs.

Where the screening assay is a binding assay, one or more of the molecules 
may be joined to a label, where the label can directly or indirectly provide a 
detectable signal. Various labels include radioisotopes, fluorescers, 
chemiluminescers, enzymes, specific binding molecules, particles, e.g. magnetic 
particles, and the like. Specific binding molecules include pairs, such as biotin and 
streptavidin, digoxin and antidigoxin etc. For the specific binding members, the 
complementary member would normally be labeled with a molecule that provides for 
detection, in accordance with known procedures. 

A variety of other reagents may be included in the screening assay. These
include reagents like salts, neutral proteins, e.g. albumin, detergents, etc., that are
used to facilitate optimal protein-protein binding and/or reduce non-specific or
background interactions. Reagents that improve the efficiency of the assay, such as
protease inhibitors, nuclease inhibitors, anti-microbial agents, etc. may be used. The
components of the mixture are added in any order that provides for the requisite binding.
Incubations are performed at any suitable temperature, typically between 4 and 40° C.
Incubation periods are selected for optimum activity, but may also be optimized to
facilitate rapid high-throughput screening. Typically between 0.1 and 1 hours will be
sufficient.

The compounds having the desired biological activity may be administered in
an acceptable carrier to a host. The active agents may be administered in a variety
of ways. Depending upon the manner of introduction, the compounds may be
formulated in a variety of ways. The concentration of therapeutically active
compound in the formulation may vary from about 0.01-100 wt.%.

It must be noted that as used herein and in the appended claims, the singular
forms "a", "and", and "the" include plural referents unless the context clearly dictates
otherwise. Thus, for example, reference to "a complex" includes a plurality of such
complexes and reference to "the formulation" includes reference to one or more
formulations and equivalents thereof known to those skilled in the art, and so forth.

Unless defined otherwise, all technical and scientific terms used herein have
the same meaning as commonly understood to one of ordinary skill in the art to which
this invention belongs. Although any methods, devices and materials similar or
equivalent to those described herein can be used in the practice or testing of the
invention, the preferred methods, devices and materials are now described.

All publications mentioned herein are incorporated herein by reference for the
purpose of describing and disclosing, for example, the methods and methodologies
that are described in the publications which might be used in connection with the
presently described invention. The publications discussed above and throughout the
text are provided solely for their disclosure prior to the filing date of the present
application. Nothing herein is to be construed as an admission that the inventors are
not entitled to antedate such disclosure by virtue of prior invention.

The following examples are put forth so as to provide those of ordinary skill in
the art with a complete disclosure and description of how to make and use the
subject invention, and are not intended to limit the scope of what is regarded as the
invention. Efforts have been made to ensure accuracy with respect to the numbers
used (e.g. amounts, temperature, concentrations, etc.) but some experimental errors
and deviations should be allowed for. Unless otherwise indicated, parts are parts by
weight, molecular weight is average molecular weight, temperature is in degrees
Celsius, and pressure is at or near atmospheric.

Experimental 

Cloning and Characterization of Arabidopsis thaliana Genes.

Following DNA isolation, sequencing was performed using the Dye Primer
Sequencing protocol, below. The sequencing reactions were loaded by hand onto a
48 lane ABI 377 and run on a 36 cm gel with the 36E-2400 run module and
extraction. Gel analysis was performed with ABI software.

The Phred program was used to read the sequence trace from the ABI
sequencer, call the bases and produce a sequence read and a quality score for each
base call in the sequence (Ewing et al. (1998) Genome Research 8:175-185; Ewing
and Green (1998) Genome Research 8:186-194). PolyPhred may be used to detect
single nucleotide polymorphisms in sequences (Kwok et al. (1994) Genomics
25:615-622; Nickerson et al. (1997) Nucleic Acids Research 25(14):2745-2751).
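Phred quality scores relate to the estimated probability P that a base call is wrong by Q = -10 log10(P), so Q20 corresponds to a 1 in 100 error rate. The sketch below converts between the two and trims low-quality read ends. It is an illustration, not part of the Phred program itself: the function names are hypothetical and the `min_q=20` cutoff is an assumed threshold, not one stated in this protocol.

```python
import math


def phred_to_error_probability(q):
    """Estimated probability that a base call with quality q is incorrect."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p):
    """Phred-scaled quality for an error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)


def trim_low_quality_ends(bases, qualities, min_q=20):
    """Drop leading and trailing bases whose quality is below min_q."""
    start = 0
    end = len(bases)
    while start < end and qualities[start] < min_q:
        start += 1
    while end > start and qualities[end - 1] < min_q:
        end -= 1
    return bases[start:end], qualities[start:end]
```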

Microwave Plasmid Protocol: Fill Beckman 96 deep-well growth blocks with 1 ml of
TB containing 50 µg of ampicillin per ml. Inoculate each well with a colony picked
with a toothpick or a 96-pin tool from a glycerol stock plate. Cover the blocks with a
plastic lid and tape at two ends to hold lid in place. Incubate overnight (16-24 hours
depending on the host strain) at 37° C with shaking at 275 rpm in a New Brunswick
platform shaker. Pellet cells by centrifugation for 20 minutes at 3250 rpm in a
Beckman GS-6KR, decant TB and freeze pelleted cells in the 96 well block. Thaw
blocks on the bench when ready to continue.

Prepare the MW-Tween20 solution

For four blocks:                         For 16 blocks:
50 ml STET/TWEEN20                       200 ml STET/TWEEN
2 tubes RNAse (10 mg/ml, 600 µl ea)      8 tubes RNAse
1 tube lysozyme (25 mg)                  4 tubes lysozyme
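The two columns of the MW-Tween20 recipe scale linearly (the 16-block amounts are four times the four-block amounts), so quantities for other block counts can be computed from the four-block column. The helper below is a convenience sketch; rounding up to whole batches of four blocks is an assumption made here so that tube counts stay integral, and the dictionary keys are hypothetical names.

```python
import math

# Amounts from the "For four blocks" column of the recipe.
PER_FOUR_BLOCKS = {"stet_tween_ml": 50, "rnase_tubes": 2, "lysozyme_tubes": 1}


def mw_tween20_batch(blocks):
    """Reagent amounts for the given number of blocks, in whole 4-block batches."""
    if blocks < 1:
        raise ValueError("blocks must be a positive integer")
    batches = math.ceil(blocks / 4)  # assumption: always prepare whole batches
    return {reagent: amount * batches for reagent, amount in PER_FOUR_BLOCKS.items()}
```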



Pipette RNAse and Lysozyme into the corner of a beaker. Add Tween 20
solution and swirl to mix completely. Use the Multidrop (or Biohit) to add 25 µl of
sterile H2O (from the L size autoclaved bottles) to each well. Resuspend the pellets
by vortexing on setting 10 of the platform vortexer. Check pellets after 4 min. and
repeat as necessary to resuspend completely. Use the Multidrop to add 70 µl of the
freshly prepared MW-Tween 20 solution to each well. Vortex at setting 6 on the
platform vortexer for 15 seconds. Do not cause frothing.

Incubate the blocks at room temperature for 5 min. Place two blocks at a time
in the microwave (1000 Watts) with the tape (placed on the H1 to H12 side of the
block) facing away from each other and turn on at full power for 30 seconds. Rotate
the blocks so that the tapes face towards each other and turn on at full power again
for 30 seconds.

Immediately remove the blocks from the microwave and add 300 µl of sterile
ice cold H2O with the Multidrop. Seal the blocks with foil tape and place them in an
H2O/ice bath.

Vortex the blocks on 5 for 15 seconds and leave them in the H2O/ice bath. Return to
step 7 until all the blocks are in the ice water bath. Incubate the blocks for 15
minutes on ice. Spin the blocks for 30 minutes in the Beckman GS-6KR with GH3.8
rotor with Microplus carrier at 3250 rpm.

Transfer 100 µl of the supernatant to Corning/Costar round bottom 96 well
trays. Cover with foil and refrigerate if the samples are to be sequenced right away.
If they are not to be sequenced within the next day, freeze them at -20° C.

Dye Primer Sequencing: Spin down the DP brew trays and DNA template by
pulsing in the Beckman GS-6KR with GH3.8 rotor with Microplus carrier. The Big Dye
Primer reaction mix trays (one 96 well cycleplate (Robbins) for each nucleotide)
contain 3 microliters of reaction mix per well.

Use a twelve channel pipetter (Costar) to add 2 µl of template to one each of the
G, A, T, and C trays for each template plate. Pulse again to get both the reaction mix and
template into the bottom of the cycle plate and put them into the MJ Research DNA
Tetrad (PTC-225).
Start program Dye-Primer. Dye-primer is:

96° C, 1 min              1 cycle

96° C, 10 sec.
55° C, 5 sec.
70° C, 1 min              15 cycles

96° C, 10 sec.
70° C, 1 min.             15 cycles

4° C soak
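
The Dye-Primer program above can be written down as data, which makes it easy to check the total programmed hold time. The following Python sketch is illustrative only: the data layout and the function name are assumptions, and ramp times between temperatures and the final 4° C soak are ignored.

```python
# Each stage: (number of cycles, [(temperature in deg C, hold time in seconds), ...])
DYE_PRIMER_PROGRAM = [
    (1, [(96, 60)]),
    (15, [(96, 10), (55, 5), (70, 60)]),
    (15, [(96, 10), (70, 60)]),
]


def total_hold_seconds(program):
    """Sum the programmed hold times over all stages and cycles."""
    return sum(cycles * sum(hold for _, hold in steps) for cycles, steps in program)


print(total_hold_seconds(DYE_PRIMER_PROGRAM) / 60.0, "minutes of programmed holds")
```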

When done cycling, using the Robbins Hydra 290 add 100 µl of 100% ethanol to the
A reaction cycle plate and pool the contents of all four cycle plates into the
appropriate well.

To perform ethanol precipitation: Use Hydra program 4 to add 100 µl 100%
ethanol to each A tray. Use Hydra program 5 to transfer the ethanol and therefore
combine the samples from plate to plate. Once the G, A, T, and C trays of each
block are mixed, spin for 30 minutes at 3250 rpm in the Beckman. Pour off the ethanol
with a firm shake and blot on a paper towel before drying in the speed vac (~10
minutes or until dry). If ready to load, add 3 µl dye, denature in the oven at 95° C
for ~5 minutes and load 2 µl. If to store, cover with tape and store at -20° C.

Common Solutions

Terrific Broth

Per liter:
900 ml H2O
12 g bacto tryptone
24 g bacto-yeast extract
4 ml glycerol

Shake until dissolved and then autoclave. Allow the solution to cool to 60° C or less
and then add 100 ml of sterile 0.17M KH2PO4, 0.72M K2HPO4 (in the hood w/ sterile
technique).

0.17M KH2PO4, 0.72M K2HPO4

Dissolve 2.31 g of KH2PO4 and 12.54 g of K2HPO4 in 90 ml of H2O.
Adjust volume to 100 ml with H2O and autoclave.
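
As a quick check, the stated masses (2.31 g KH2PO4 and 12.54 g K2HPO4 in 100 ml) can be converted to molarity. This is a minimal sketch; the molecular weights are standard anhydrous values supplied here as assumptions, since the text does not state them.

```python
# Standard anhydrous molecular weights in g/mol (assumed, not given in the text).
MW_KH2PO4 = 136.09
MW_K2HPO4 = 174.18


def molarity(grams, mol_weight, final_volume_ml):
    """Moles of solute per liter of final solution."""
    return grams / mol_weight / (final_volume_ml / 1000.0)


kh2po4 = molarity(2.31, MW_KH2PO4, 100)
k2hpo4 = molarity(12.54, MW_K2HPO4, 100)
print(round(kh2po4, 2), "M KH2PO4;", round(k2hpo4, 2), "M K2HPO4")
```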

Sequence loading Dye

20 ml deionized formamide
3.6 ml dH2O
400 µl 0.5M EDTA, pH 8.0
0.2 g Blue Dextran

*Light sensitive, cover in foil or store in the dark.

STET/TWEEN

10 ml 5M NaCl
5 ml 1M Tris, pH 8.0
1 ml 0.5M EDTA, pH 8.0
25 ml Tween 20

Bring volume to 500 ml with H2O
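
The final concentrations in the STET/TWEEN recipe follow from simple dilution (stock concentration times stock volume divided by final volume). The sketch below assumes the Tris line reads "5 ml 1M Tris", which is an interpretation of a damaged line in the source, and the function name is hypothetical.

```python
def final_concentration(stock_conc, stock_ml, total_ml=500.0):
    """Dilution: C_final = C_stock * V_stock / V_total."""
    return stock_conc * stock_ml / total_ml


nacl_molar = final_concentration(5.0, 10)   # 5M NaCl, 10 ml
tris_molar = final_concentration(1.0, 5)    # assumed 1M Tris stock, 5 ml
edta_molar = final_concentration(0.5, 1)    # 0.5M EDTA, 1 ml
tween_percent = 100.0 * 25 / 500            # 25 ml Tween 20 in 500 ml, % v/v

print(nacl_molar, tris_molar, edta_molar, tween_percent)
```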

The sequencing reactions are run on an ABI 377 sequencer per manufacturer's
instructions. The sequencing information obtained from each run is analyzed as follows.

Sequencing reads are screened for ribosomal, mitochondrial, chloroplast or
human sequence contamination. In good sequences, vector is marked by x's.
These sequences go into biolims regardless of whether or not they pass the criteria
for a 'good' sequence. The criteria are >= 100 bases with a phred score of >= 20, with 15
of these bases adjacent to each other.
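
The 'good' sequence criteria can be expressed as a small predicate over a read's per-base phred scores. This is a sketch of one reading of the text (at least 100 bases scoring 20 or more, including at least one run of 15 consecutive such bases); the function and parameter names are assumptions.

```python
def is_good_read(quals, min_q=20, min_count=100, min_run=15):
    """Return True if the read has >= min_count bases with phred >= min_q
    and at least one run of min_run consecutive such bases."""
    count = 0      # total high-quality bases seen
    run = 0        # length of the current high-quality run
    best_run = 0   # longest high-quality run seen so far
    for q in quals:
        if q >= min_q:
            count += 1
            run += 1
            best_run = max(best_run, run)
        else:
            run = 0
    return count >= min_count and best_run >= min_run


print(is_good_read([30] * 120))
```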

Sequencing reads that pass the criteria for good sequences are downloaded
for assembly into consensus sequences (contigs). The program Phrap (copyrighted
by Phil Green at University of Washington, Seattle, WA) utilizes both the Phred
sequence information and the quality calls to assemble the sequencing reads.
Parameters used with Phrap were determined empirically to minimize assembly of
chimeric sequences and maximize differential detection of closely related members
of gene families. The following parameters were used with the Phrap program to
perform the assembly:



Parameter       Value   Description
Penalty         -6      Penalty for mismatches (substitutions)
Minmatch        40      Minimum length of matching sequence to use in assembly of reads
Trim penalty    0       Penalty used for identifying degenerate sequence at beginning and end of read
Minscore        80      Minimum alignment score


Results from the Phrap analysis yield either contigs consisting of a consensus of two
or more overlapping sequence reads, or singlets that are non-overlapping.
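
The parameter values listed above could be passed to Phrap on the command line roughly as follows. This is a sketch under assumptions: the flag names are taken to mirror the parameter names in the table and should be checked against the documentation of the installed Phrap version, and the input file name reads.fasta is hypothetical.

```python
import shlex

# Values from the parameter table; flag spellings are assumptions to verify.
PHRAP_PARAMS = {"penalty": -6, "minmatch": 40, "trim_penalty": 0, "minscore": 80}


def phrap_command(reads_fasta, params=PHRAP_PARAMS, exe="phrap"):
    """Build an argument list suitable for subprocess.run()."""
    cmd = [exe, reads_fasta]
    for name, value in params.items():
        cmd += ["-" + name, str(value)]
    return cmd


print(shlex.join(phrap_command("reads.fasta")))
```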

The contig and singlet assemblies were further analyzed to eliminate low
quality sequence utilizing a program to filter sequences based on quality scores
generated by the Phred program. The threshold quality for "high quality" base calls is
20. Sequences with fewer than 50 contiguous high quality base calls at the beginning
of the sequence, and also at the end of the sequence, were discarded. Additionally,
the maximum allowable percentage of "low quality" base calls in the final sequence is
2%; otherwise the sequence is discarded.
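
A filter of this kind might look like the following sketch. It implements one possible reading of the rule (the first 50 and the last 50 base calls must all be high quality, and at most 2% of all base calls may be low quality); the actual filtering program is not described in the text, so the logic and names here are assumptions.

```python
def passes_quality_filter(quals, hq=20, end_window=50, max_low_fraction=0.02):
    """Keep a sequence only if both ends are high quality and the overall
    fraction of low-quality base calls does not exceed max_low_fraction."""
    if len(quals) < end_window:
        return False
    head_ok = all(q >= hq for q in quals[:end_window])
    tail_ok = all(q >= hq for q in quals[-end_window:])
    low_fraction = sum(1 for q in quals if q < hq) / len(quals)
    return head_ok and tail_ok and low_fraction <= max_low_fraction


print(passes_quality_filter([30] * 100 + [10] * 3 + [30] * 100))
```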

The stand-alone BLAST programs and GenBank databases were downloaded
from NCBI for use on secure servers at the Paradigm Genetics, Inc. site. The
sequences from the assembly were compared to the GenBank NR database
downloaded from NCBI using the gapped version (2.0) of BLASTX. BLASTX
translates the DNA sequence in all six reading frames and compares it to an amino
acid database. Low complexity sequences are filtered in the query sequence
(Altschul et al. (1997) Nucleic Acids Res 25(17):3389-402).
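
To illustrate what translation in all six reading frames means, the sketch below translates a DNA string in the three forward frames and the three frames of its reverse complement using the standard genetic code. It is not the BLASTX implementation; the function names are hypothetical, and codons containing characters other than A, C, G, T are rendered as X.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(seq):
    """Translate complete codons only; unknown codons become X."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


def six_frames(dna):
    """Return translations keyed by frame label (+1..+3, -1..-3)."""
    dna = dna.upper()
    reverse = dna.translate(COMPLEMENT)[::-1]
    frames = {}
    for offset in range(3):
        frames["+%d" % (offset + 1)] = translate(dna[offset:])
        frames["-%d" % (offset + 1)] = translate(reverse[offset:])
    return frames


print(six_frames("ATGGCCTAA"))
```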
GenBank sequences found in the BLASTX search with an E value of less than
1e-10 are considered to be highly similar, and the GenBank definition lines were used
to annotate the query sequences.
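
The annotation rule can be sketched as a filter over BLASTX hits: keep hits below the E value threshold and report the best one as "E value, definition line", the form used in Table 1. The input layout (a list of E value and definition-line pairs), the default threshold of 1e-10, and the function name are assumptions made for illustration.

```python
def annotate(hits, max_evalue=1e-10):
    """hits: list of (evalue, definition_line) pairs for one query.
    Return 'E-value definition' for the best significant hit, else None."""
    significant = [hit for hit in hits if hit[0] < max_evalue]
    if not significant:
        return None  # query falls through to the motif search
    evalue, definition = min(significant, key=lambda hit: hit[0])
    return "%s %s" % (format(evalue, ".0E"), definition)


example_hits = [(4e-41, ">emb|CAA72903| (Y12227) topoisomerase [Arabidopsis thaliana]"),
                (2e-5, ">hypothetical weak hit")]
print(annotate(example_hits))
```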

When no significantly similar sequences were found as a result of the BLASTX
search, the query sequences were compared with the PROSITE database (Bairoch,
A. (1992) PROSITE: A dictionary of sites and patterns in proteins. Nucleic Acids
Research 20:2013-2018) to locate functional motifs.

Query sequences were first translated in six reading frames using the
Wisconsin GCG pepdata program (Wisconsin Package Version 10.0, Genetics
Computer Group (GCG), Madison, Wisconsin, USA). The Wisconsin GCG motifs
program (Wisconsin Package Version 10.0, Genetics Computer Group (GCG),
Madison, Wisconsin, USA) was used to locate motifs in the peptide sequence, with
no mismatches allowed. Motif names from the PROSITE results were used to
annotate these query sequences.
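
An exact-match motif search of this kind can be sketched by converting a PROSITE-style pattern to a regular expression and reporting 1-based coordinates in the "Name(start-end)" form used in Table 1. This is not the GCG motifs program; the converter handles only the simple pattern elements shown, and the example patterns ([ST]-x-[RK] for a protein kinase C phosphorylation site, R-G-D for the cell attachment sequence) are given as commonly cited forms and should be treated as assumptions.

```python
import re

ELEMENT = re.compile(r"(\[[A-Z]+\]|[A-Zx])(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Convert a simple PROSITE pattern such as '[RK]-x(2,3)-[DE]-x(2,3)-Y'."""
    regex = ""
    for element in pattern.split("-"):
        match = ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError("unsupported pattern element: " + element)
        residue, low, high = match.groups()
        regex += "." if residue == "x" else residue
        if low:
            regex += "{" + low + ("," + high if high else "") + "}"
    return regex


def find_motifs(peptide, name, pattern):
    """Report every exact match, including overlapping ones, with 1-based coordinates."""
    finder = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return ["%s(%d-%d)" % (name, m.start() + 1, m.start() + len(m.group(1)))
            for m in finder.finditer(peptide)]


print(find_motifs("AASARGDK", "Rgd", "R-G-D"))
```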
Table 1

SEQID  Reference  Annotation
1      2028001    Tyr_Phospho_Site(512-519)
2      2028002    1E-30 >gi|4220454 (AC006216) Similar to gi|3413714 T19L18.21 myrosinase-binding protein from Arabidopsis thaliana BAC gb|AC004747. ESTs gb|65870 and gb|T20812 come from this gene. [Arabidopsis thaliana] Length = 303
3      2028003    1E-133 >sp|P43297|RD21_ARATH CYSTEINE PROTEINASE RD21A [illegible] 3.4.22.-) RD21A precursor - Arabidopsis thaliana >gi|435619|dbj|BAA02374| [illegible] thiol protease [Arabidopsis thaliana] Length = [illegible]
4      2028004    [illegible] mitogen activated protein [illegible] [Arabidopsis thaliana] Length = [illegible]
5      2028005    1E-28 >gb|AAD36643.1|AE001802_12 (AE001802) hemolysin [Thermotoga maritima] Length = 267
6      2028006    4E-41 >emb|CAA72903| (Y12227) topoisomerase [Arabidopsis thaliana] Length = 618
7      2028007    1E-103 >emb|CAB36783.1| (AL035525) aminopeptidase-like protein [Arabidopsis thaliana] Length = 873
8      2028008    2E-26 >sp|P46810|GUAA_MYCLE GMP SYNTHASE [GLUTAMINE-HYDROLYZING] (GLUTAMINE AMIDOTRANSFERASE) (GMP SYNTHETASE) >gi|2145847|pir||S72813 GMP synthase (glutamine-hydrolysing) (EC 6.3.5.2) guaA - Mycobacterium leprae >gi|466934 (U00015) guaA; B1620_C2_205 [Mycobacterium leprae] Length = 590
9      2028009    Tyr_Phospho_Site(706-713)
10     2028010    2E-33 >gb|AAD42941.1|AF091621_1 (AF091621) ubiquitin-conjugating enzyme E2 [Catharanthus roseus] Length = 153
11     2028011    1E-14 >gi|2829899 (AC002311) similar to ripening-induced protein, gp|AJ001449|2465015 and major latex protein, gp|X91961|1107495 [Arabidopsis thaliana] Length = 160
12     2028012    Tyr_Phospho_Site(900-908)
13     2028013    2E-37 >emb|CAB52246.1| (AJ245478) alpha galactosyltransferase [Trigonella foenum-graecum] Length = 438
14     2028014    Tyr_Phospho_Site(181-187)
15     2028015    Rgd(201-203)
16     2028016    3E-70 >sp|Q08770|RL10_ARATH 60S RIBOSOMAL PROTEIN L10 (WILM'S TUMOR SUPPRESSOR PROTEIN HOMOLOG) >gi|478401|pir||JQ2244 ribosomal protein L10.e, cytosolic - Arabidopsis thaliana >gi|[illegible]|emb|CAA78856| (Z15157) Wilm's tumor suppressor homologue [Arabidopsis thaliana] Length = 220


17     2028017    1E-80 >gi|2924779 (AC002334) 3-ketoacyl-CoA thiolase [Arabidopsis thaliana] >gi|2981616|dbj|BAA25248| (AB008854) 3-ketoacyl-CoA thiolase [Arabidopsis thaliana] >gi|2981618|dbj|BAA25249| (AB008855) 3-ketoacyl-CoA thiolase [Arabidopsis thaliana] Length = 462
18     2028018    3' Tyr_Phospho_Site(224-232)
19     2028019    3' Pkc_Phospho_Site(35-37)
20     2028020    5' Pkc_Phospho_Site(86-88)
21     2028021    5' 3E-21 >gi|3123745|dbj|BAA25999| (AB013447) aluminum-induced
22     2028022    Tyr_Phospho_Site(211-218)
23     2028023    5' 3E-32 >gi|451812|sp|Q05047|CP72_CATRO CYTOCHROME P450 72A1 (CYPLXXII) (PROBABLE GERANIOL-10-HYDROXYLASE) (GE10H) >gi|167484 (L10081) Cytochrome P-450 protein [Catharanthus roseus] >gi|445604|prf||1909351A cytochrome P450 [Catharanthus roseus] Length = 524
24     2028024    5' Tyr_Phospho_Site(825-833)
25     2028025    5' 2E-75 >gi|4006827|gb|AAC95169.1| (AC005970) subtilisin-like protease [Arabidopsis thaliana] Length = [illegible]
26     2028026    5E-40 >gi|135915|sp|P28493|PR5_ARATH PATHOGENESIS-RELATED PROTEIN 5 PRECURSOR (PR-5) >gi|[illegible]|pir||JQ1695 pathogenesis-related protein 5 precursor - Arabidopsis thaliana >gi|166865 (M90510) thaumatin-like protein [Arabidopsis thaliana] >gi|1448919 (L78079) thaumatin-like protein [Arabidop
27     2028027    [illegible] >gb|[illegible] (AC006223) sugar starvation-induced protein [Arabidopsis thaliana] Length = 256
28     2028028    9E-34 >sp|Q39230|SYS_ARATH SERYL-TRNA SYNTHETASE (SERINE--TRNA LIGASE) (SERRS) >gi|2129737|pir||S71293 seryl-tRNA synthetase - Arabidopsis thaliana >gi|[illegible]|emb|[illegible]| [illegible] seryl-tRNA synthetase [Arabidopsis thaliana] Length = [illegible]
29     2028029    [illegible] >sp|[illegible] MALATE DEHYDROGENASE [NADP], CHLOROPLAST PRECURSOR (NADP-MDH) >gi|481222|pir||S38346 malate dehydrogenase (NADP+) (EC 1.1.1.82) - garden pea >gi|397475|emb|CAA52614| (X74507) malate dehydrogenase (NADP+) [Pisum sativum] Length = 441
30     2028030    Rgd(1079-1081)
31     2028031
32     2028032    3E-23 >emb|CAB10154| (Z97211) probable involvement in ergosterol synthesis [Schizosaccharomyces pombe] Length = 1213
33     2028033    1E-102 >dbj|BAA28531| (D78598) cytochrome P450 monooxygenase [Arabidopsis thaliana] >gi|5262761|emb|CAB45909.1| (AL080283) cytochrome P450 monooxygenase [Arabidopsis thaliana] Length = [illegible]
34     2028034    5E-36 >sp|Q42885|ARC2_LYCES CHORISMATE SYNTHASE 2 PRECURSOR (5-ENOLPYRUVYLSHIKIMATE-3-PHOSPHATE PHOSPHOLYASE 2) >gi|542027|pir||S40409 chorismate synthase (EC 4.6.1.4) 2 precursor - tomato >gi|410484|emb|CAA79854| (Z21791) chorismate synthase 2 [Lycopersicon esculentum] Length = 431
35     2028035    Tyr_Phospho_Site(19-25)
36     2028036    1E-123 >emb|CAA19688.1| (AL024486) aspartate kinase-homoserine dehydrogenase-like protein [Arabidopsis thaliana] Length = 916
37     2028037    2E-23 >gb|AAD48585.1| (AF110645) candidate tumor suppressor p33 ING1 homolog [Homo sapiens] Length = 249
38     2028038    Tyr_Phospho_Site(939-945)



39     2028039    1E-49 >gi|1619956 (U72151) voltage-gated chloride channel [Arabidopsis thaliana] Length = 773
40     2028040    1E-22 >gi|2338712 (AF013959) metallothionein-like protein [Arabidopsis thaliana] Length = 69
41     2028041    Pkc_Phospho_Site(45-47)
42     2028042    5E-49 >sp|Q05047|CP72_CATRO CYTOCHROME P450 72A1 (CYPLXXII) (PROBABLE GERANIOL-10-HYDROXYLASE) (GE10H) >gi|167484 (L10081) Cytochrome P-450 protein [Catharanthus roseus] >gi|445604|prf||1909351A cytochrome P450 [Catharanthus roseus] Length = 524
43     2028043    Pkc_Phospho_Site(62-64)
44     2028044    3' 1E-34 >gi|6016708|gb|AAF01534.1|AC009325_4 (AC009325) protein kinase [Arabidopsis thaliana] Length = 411
45     2028045    3' Tyr_Phospho_Site(46-53)
46     2028046    3' Tyr_Phospho_Site(297-304)
47     2028047    3' Tyr_Phospho_Site([illegible])
48     2028048    3' Tyr_Phospho_Site(77-84)
49     2028049    3' Tyr_Phospho_Site([illegible])
50     2028050    5' 3E-76 >gi|2864613|emb|CAA16960| (AL021811) S-receptor kinase-like protein [Arabidopsis thaliana] >gi|4049333|emb|CAA22558.1| (AL034567) S-receptor kinase-like protein [Arabidopsis thaliana] Length = 778
51     2028051    5' 3E-48 >gi|1514643|emb|CAA94437| (Z70524) PDR5-like ABC transporter [Spirodela polyrrhiza] Length = 1441
52     2028052    Pkc_Phospho_Site(18-20)
53     2028053    [illegible] ALLO-TA) (L-ALLO-THREONINE ACETALDEHYDE-LYASE) >[illegible] L-allo-threonine aldolase [Aeromonas jandaei] Length = 338
54     2028054    8E-68 >gb|AAD46410.1|AF096260_1 (AF096260) ER66 protein [Lycopersicon esculentum] Length = [illegible]
55     2028055    [illegible] [Arabidopsis thaliana] Length = [illegible]
56     2028056    3E-67 >gi|2281645 (AF003103) AP2 domain containing protein RAP2.10 [Arabidopsis thaliana] >gi|2632063|emb|CAA05630.1| (AJ002598) TINY-like protein [Arabidopsis thaliana] Length = 259
57     2028057    Tyr_Phospho_Site(473-481)
58     2028058    Tyr_Phospho_Site([illegible])
59     2028059    [illegible] >sp|[illegible]_ARATH DNA (CYTOSINE-5)-METHYLTRANSFERASE [illegible] >gi|1363480|pir||S59604 DNA (cytosine-5-)-methyltransferase (EC 2.1.1.37) - Arabidopsis thaliana >gi|[illegible] [illegible] cytosine-5 methyltransferase [Arabidopsis thaliana] Length = [illegible]
60     2028060    Tyr_Phospho_Site(426-434)
61     2028061
62     2028062    [illegible] >gi|[illegible]|emb|[illegible]| [illegible] ferredoxin-dependent glutamate synthase [Arabidopsis thaliana] Length = [illegible]
63     2028063    3' Pkc_Phospho_Site(4-6)
64     2028064    glucuronyltransferase-like protein [Arabidopsis thaliana] Length = 544
65     2028065    3' Pkc_Phospho_Site(48-50)
66     2028066    5' Tyr_Phospho_Site(696-704)
67     2028067    5' 3E-75 >gi|3738320 (AC005170) cinnamoyl CoA reductase [Arabidopsis thaliana] Length = 303
68     2028068    5' Rgd(11-13)



69     2028069    5' [illegible] >gi|134103|sp|P21240|RUBB_ARATH RUBISCO SUBUNIT BINDING-PROTEIN BETA SUBUNIT PRECURSOR (60 KD CHAPERONIN BETA SUBUNIT) (CPN-60 BETA) Length = 600
70     2028070    5' Tyr_Phospho_Site(407-414)
71     2028071    5' 2E-15 >gi|3157926 (AC002131) Strong similarity to extensin-like protein gb|Z34465 from Zea mays. [Arabidopsis thaliana] Length = 744
72     2028072    Tyr_Phospho_Site(809-817)
73     2028073    Pkc_Phospho_Site(15-17)
74     2028074    Pkc_Phospho_Site([illegible])
75     2028075    [illegible] >gb|[illegible]|AF132958_1 (AF132958) CGI-24 protein [Homo sapiens] Length = 241
76     2028076    Tyr_Phospho_Site(71-79)
77     2028077    3E-41 >emb|CAB10236.1| (Z97336) acylaminoacyl-peptidase like protein [Arabidopsis thaliana] Length = [illegible]
78     2028078    Pkc_Phospho_Site(66-68)
79     2028079    [illegible] TATA element modulatory factor 1 >gi|[illegible] transcription factor TMF, TATA element modulatory factor - human >gi|[illegible]|gb|AAD54608.1| (L01042) TATA element modulatory factor [Homo sapi
80     2028080    [illegible] >dbj|[illegible]| (D89051) ERD6 protein [Arabidopsis thaliana] Length = 496
81     2028081    Pkc_Phospho_Site([illegible])
82     2028082    Pkc_Phospho_Site([illegible])
83     2028083    [illegible] >gi|[illegible] [illegible] arabinogalactan-protein [Arabidopsis thaliana] Length = 131
84     2028084    [illegible] >gb|[illegible]|AF117267_1 (AF117267) UDP glucose:flavonoid 3-O-glucosyltransferase [Malus domestica] Length = 483
85     2028085    Tyr_Phospho_Site(497-504)
86     2028086    Pkc_Phospho_Site([illegible])
87     2028087    3' 8E-25 >gi|4093155 (AF088281) phytochrome-associated protein 1 [Arabidopsis thaliana] Length = 267
88     2028088    3' Pkc_Phospho_Site(18-20)
89     2028089    3' [illegible] >gi|[illegible]|emb|[illegible]| [illegible] thiol-disulfide interchange like protein [Arabidopsis thaliana] Length = 261
90     2028090    3' [illegible] >gi|[illegible]|sp|[illegible] PROBABLE GLUTAMYL-TRNA(GLN) AMIDOTRANSFERASE SUBUNIT A (GLU-ADT SUBUNIT A) >gi|2648182 (AE000943) Glu-tRNA amidotransferase, subunit A (gatA-2) [Archaeoglobus fulgidus] Length = 457
91     2028091    3' 7E-57 >gi|4510424|gb|AAD21510.1| (AC006929) carboxypeptidase [Arabidopsis thaliana] Length = 361
92     2028092    3' Pkc_Phospho_Site(127-129)
93     2028093    [illegible] >gi|[illegible]|sp|[illegible]_ARATH OMEGA-6 FATTY ACID DESATURASE, ENDOPLASMIC RETICULUM (DELTA-12 DESATURASE) >gi|438451 (L26296) delta-12 desaturase [Arabidopsis thaliana] Length = 383
94     2028094    5' Pkc_Phospho_Site([illegible])
95     2028095    5' Tyr_Phospho_Site([illegible])
96     2028096    5' Tyr_Phospho_Site(479-486)
97     2028097    5' [illegible] >gi|[illegible]|sp|[illegible]_HUMAN 26S PROTEASOME REGULATORY SUBUNIT S2 (P97) (TUMOR NECROSIS FACTOR TYPE 1 RECEPTOR ASSOCIATED PROTEIN 2) (55.11 PROTEIN) Length = 908
98     2028098    5' Tyr_Phospho_Site(102-109)
99     2028099    5' 2E-28 >gi|2735764 (AF008651) MADS transcriptional factor; STMADS16 [Solanum tuberosum] Length = 234
100    2028100    5E-28 >gb|AAD43611.1|AC005698_10 (AC005698) T3P18.10 [Arabidopsis thaliana] Length = 482


101    2028101    Tyr_Phospho_Site(611-618)
102    2028102    9E-34 >gi|3687235 (AC005169) copia-like transposable element [Arabidopsis thaliana] Length = 213
103    2028103    3E-80 >emb|CAA76178.1| (Y16327) cyclic nucleotide-regulated ion channel [Arabidopsis thaliana] Length = 716
104    2028104    Tyr_Phospho_Site(1098-1104)
105    2028105    Tyr_Phospho_Site(164-172)
106    2028106    Pkc_Phospho_Site(15-17)
107    2028107    [illegible] >emb|CAA17550| (AL021951) receptor protein kinase-like protein [Arabidopsis thaliana] Length = 980
108    2028108    1E-51 >dbj|BAA77337.1| (AB019533) Nad-dependent formate dehydrogenase [Oryza sativa] Length = 376
109    2028109    2E-34 >emb|CAB16828.1| (Z99708) splicing factor-like protein [Arabidopsis thaliana] Length = 573
110    2028110    Tyr_Phospho_Site(69-76)
111    2028111    [illegible] >sp|[illegible]_ARATH GLUTATHIONE-DEPENDENT FORMALDEHYDE DEHYDROGENASE (FDH) (FALDH) (GSH-FDH) >gi|1498024 (U63931) glutathione-dependent formaldehyde dehydrogenase [Arabidopsis thaliana] Length = 379
112    2028112    2E-77 >emb|CAB10698| (Z97558) argininosuccinate lyase [Arabidopsis thaliana] Length = 517
113    2028113    Tyr_Phospho_Site(375-382)
114    2028114    [illegible] >pir||[illegible] P-glycoprotein atpgp1 - Arabidopsis thaliana >gi|3849833|emb|CAA43646| (X61370) P-glycoprotein [Arabidopsis thaliana] >gi|4883607|gb|AAD31576.1|AC006922_8 (AC006922) P-glycoprotein pgp1 [Arabidopsis thaliana] Length = 1286
115    2028115    5E-46 >emb|CAB56614.1| (AJ234901) acetolactate synthase small subunit [Nicotiana plumbaginifolia] Length = 449
116    2028116
117    2028117    3' 2E-18 >gi|3249071 (AC004473) Contains similarity to protein-tyrosine phosphatase 2 gb|L15420 from Dictyostelium discoideum. EST gb|N38718 comes from this g [Arabidopsis thaliana] Length = 547
118    2028118    3' Tyr_Phospho_Site(25-33)
119    2028119    [illegible] >gi|[illegible]|gb|AAD22126.1|AC006224_8 (AC006224) pectinesterase [Arabidopsis thaliana] Length = 518
120    2028120    3' Pkc_Phospho_Site(61-63)
121    2028121    5' Tyr_Phospho_Site([illegible])
122    2028122    5' [illegible] >gi|[illegible]|emb|CAA18474.1| (AL022347) serine/threonine kinase
123    2028123    5' 1E-41 >gi|5454072|ref|NP_006416.1|pSLU7| step II splicing factor SLU7 >gi|[illegible]|gb|AAD13774.1| (AF101074) step II splicing factor SLU7 [Homo sapiens] Length = 586
124    2028124    Tyr_Phospho_Site(469-477)
125    2028125    5' Pkc_Phospho_Site(103-105)
126    2028126    1E-69 >emb|CAA17559| (AL021961) glucosyltransferase-like protein [Arabidopsis thaliana] Length = 478
127    2028127    Pkc_Phospho_Site(1-3)
128    2028128    Tyr_Phospho_Site(959-965)
129    2028129    1E-50 >gi|1432083 (U60981) homolog to Skp1p, an evolutionarily conserved kinetochore protein in budding yeast [Arabidopsis thaliana] >gi|3068807 (AF059294) Skp1 homolog [Arabidopsis thaliana] >gi|3719209 (U97020) UIP1 [Arabidopsis thaliana] Length = 160


130    2028130    Pkc_Phospho_Site(42-44)
131    2028131    3E-30 >gi|1732515 (U62744) myosin heavy chain-like protein [Arabidopsis thaliana] Length = 209
132    2028132    [illegible] >dbj|BAA76297.1| (AB013912) DNA helicase [Mus musculus] Length = 463
133    2028133    1E-118 >sp|P32826|CBPX_ARATH SERINE CARBOXYPEPTIDASE PRECURSOR >gi|166674 ([illegible]) carboxypeptidase Y-like protein [Arabidopsis thaliana] >gi|445120|prf||1908426A carboxypeptidase Y [Arabidopsis thaliana] Length = 539
134    2028134    Pkc_Phospho_Site(67-69)
135    2028135    Pkc_Phospho_Site(1-3)
136    2028136    [illegible] >pir||[illegible] peroxidase (EC 1.11.1.7) - Arabidopsis thaliana >gi|405611|emb|CAA50677| (X71794) peroxidase [Arabidopsis thaliana] Length = 353
137    2028137    1E-17 >gi|497174 (U07631) beta-hexosaminidase [Mus musculus] >gi|497196 (U07721) beta-hexosaminidase alpha-subunit [Mus musculus] Length = 528
138    2028138    Tyr_Phospho_Site(722-729)
139    2028139    2E-36 >gb|AAD14456| (AC005275) component of cytochrome B6-F complex [Arabidopsis thaliana] >gi|5725450|emb|CAB52433.1| (AJ243702) rieske iron-sulfur protein precursor [Arabidopsis thaliana] Length = 229
140    2028140    Pkc_Phospho_Site(23-25)
141    2028141
142    2028142    [illegible] >gi|[illegible]|sp|Q42962|PGKY_TOBAC PHOSPHOGLYCERATE KINASE, CYTOSOLIC >gi|1161602|emb|CAA88840| (Z48976) phosphoglycerate kinase (PGK) [Nicotiana tabacum] Length = 401
143    2028143    >gi|3184098|emb|CAA19311.1| (AL023777) coenzyme a synthetase [Schizosaccharomyces pombe] Length = 512
144    2028144    3' Pkc_Phospho_Site(62-64)
145    2028145    5' Pkc_Phospho_Site(26-28)
146    2028146    [illegible] >gi|3415115 (AF081202) villin 2 [Arabidopsis thaliana] Length = [illegible]
147    2028147    Tyr_Phospho_Site(658-666)
148    2028148    5' Tyr_Phospho_Site(700-707)
149    2028149    [illegible] >gi|[illegible] (AF069299) contains similarity to nucleotide sugar epimerases [Arabidopsis thaliana] Length = 430
150    2028150    Tyr_Phospho_Site(304-310)
151    2028151    Tyr_Phospho_Site([illegible])
152    2028152    3E-44 >gb|AAD27568.1|AF114171_9 (AF114171) H beta 58 homolog [Sorghum
153    2028153    8E-65 >gi|3249095 (AC003114) Contains similarity to dihydrofolate reductase (dfr1) gb|L13703 from Schizosaccharomyces pombe. ESTs gb|N37567 and gb|[illegible] come from this gene [Arabidopsis thaliana] Length = 550
154    2028154    2E-78 >gi|2281085 (AC002333) CTR1 protein kinase isolog [Arabidopsis thaliana] Length = 282
155    2028155    2E-84 >emb|CAB43938.1| (AJ006349) endo-beta-1,4-glucanase [Fragaria x ananassa] Length = 620
156    2028156    Tyr_Phospho_Site(253-260)
157    2028157    Rgd(302-304)
158    2028158    Tyr_Phospho_Site(762-769)
159    2028159    8E-87 >gb|AAD21729.1| (AC006931) citrate synthase [Arabidopsis thaliana] Length = 509
160    2028160    Tyr_Phospho_Site(64-72)



161    2028161    3E-89 >sp|P42749|UBC5_ARATH UBIQUITIN-CONJUGATING ENZYME E2-21 KD 2 (UBIQUITIN-PROTEIN LIGASE 5) (UBIQUITIN CARRIER PROTEIN 5) Length = 185
162    2028162    8E-91 >emb|CAA18628.1| (AL022580) pectinacetylesterase protein [Arabidopsis thaliana] Length = 362
163    2028163    Receptor_Cytokines_1(74-87)
164    2028164    3' 6E-38 >gi|3193301 (AF069298) Arabidopsis chloroplast outer envelope 86-like protein T10P11.19 (GB:AC002330) [Arabidopsis thaliana] Length = 1503
165    2028165    3' Rgd(776-778)
166    2028166    3' 2E-13 >gi|4337011|gb|AAD18035.1| (AF119572) zinc-binding peroxisomal integral membrane protein [Arabidopsis thaliana] Length = 381
167    2028167    5' Tyr_Phospho_Site(568-575)
168    2028168    5' Pkc_Phospho_Site(100-102)
169    2028169    Pkc_Phospho_Site(15-17)
170    2028170    4E-19 >gb|AAD22663.1|AC006555_1 (AC006555) beta-1,3-glucanase [Arabidopsis thaliana] >gi|4662638|gb|AAD26909.1|AC007233_1 (AC007233) beta-1,3-glucanase [Arabidopsis thaliana] Length = 473
171    2028171    [illegible] >pir||[illegible] SRG1 protein - Arabidopsis thaliana >gi|479047|emb|CAA55654| (X79052) SRG1 [Arabidopsis thaliana] >gi|[illegible]|gb|AAD50032.1|AC007651_27 (AC007651) SRG1 Protein [Arabidopsis thaliana] Length = 358
172    2028172    [illegible] >gb|AAD22656.1|AC007138_20 (AC007138) NifU-like metallocluster assembly factor [Arabidopsis thaliana] Length = 174
173    2028173    1E-91 >gi|2062158 (AC001645) jasmonate inducible protein isolog [Arabidopsis thaliana] Length = 300
174    2028174    1E-101 >gb|AAF00639.1|AC009540_16 (AC009540) methionine synthase [Arabidopsis thaliana] Length = 765
175    2028175    2E-55 >sp|O64765|UAP1_ARATH PROBABLE UDP-N-ACETYLGLUCOSAMINE PYROPHOSPHORYLASE >gi|3033397 (AC004238) unknown protein [Arabidopsis thaliana] Length = 502
176    2028176    2E-20 >gi|1762933 (U66263) tumor-related protein [Nicotiana tabacum] Length = 210
177    2028177    2E-33 >gb|AAD24645.1|AC006220_1 (AC006220) symbiosis-related protein [Arabidopsis thaliana] Length = 120
178    2028178    Tyr_Phospho_Site(600-606)
179    2028179    8E-18 >gi|1840425 (U36586) alcohol dehydrogenase [Vitis vinifera] Length = [illegible]
180    2028180    Tyr_Phospho_Site(339-345)
181    2028181    3' Tyr_Phospho_Site(368-375)
182    2028182    4E-68 >gi|3914002|sp|O64948|LON1_ARATH MITOCHONDRIAL LON PROTEASE HOMOLOG 1 PRECURSOR >gi|2935279 (AF033862) Lon protease [Arabidopsis thaliana] Length = 888
183    2028183    5' Pkc_Phospho_Site(43-45)
184    2028184    5' 7E-51 >gi|3859659|emb|CAA20566.1| (AL031394) potassium transporter AtKT5p (AtKT5) [Arabidopsis thaliana] Length = 846
185    2028185    5' Pkc_Phospho_Site(60-62)
186    2028186    5' Rgd(273-275)
187    2028187    Pkc_Phospho_Site(30-32)
188    2028188    Pkc_Phospho_Site(57-59)
189    2028189    4E-32 >gi|2275196 (AC002337) water stress-induced protein, WSI76 isolog [Arabidopsis thaliana] >gi|4630746|gb|AAD26596.1|AC007236_1 (AC007236) water stress-induced protein [Arabidopsis thaliana] Length = 344



190    2028190    2E-14 >gi|2342666 (AF014502) seed coat peroxidase precursor [Glycine max] Length = 352
191    2028191    Tyr_Phospho_Site(150-156)
192    2028192    [illegible] >sp|[illegible]_ARATH UBIQUITIN-CONJUGATING ENZYME E2-17 KD 1 (UBIQUITIN-PROTEIN LIGASE 1) (UBIQUITIN CARRIER PROTEIN 1) >gi|1076424|pir||S43781 ubiquitin-conjugating enzyme UBC1 - Arabidopsis thaliana >gi|[illegible]
193    2028193    >emb|[illegible]| (X99097) peroxidase [Arabidopsis thaliana]
194    2028194    1E-158 >gi|3249096 (AC003114) Match to mRNA for importin alpha-like protein 4 (impa4) gb|Y14616 from A. thaliana. ESTs gb|N96440, gb|N37503, gb|N37498 and gb|T42198 come from this gene. [Arabidopsis thaliana] Length =
195    2028195    Tyr_Phospho_Site(41-48)
196    2028196    Tyr_Phospho_Site([illegible])
197    2028197    5E-29 >gi|2924788 (AC002334) similar to disease resistance protein [Arabidopsis thaliana] Length = 191
198    2028198    2E-54 >sp|P42804|HMA1_ARATH GLUTAMYL-TRNA REDUCTASE 1 PRECURSOR (GLUTR) >gi|454359 (U03774) glutamyl-tRNA reductase [Arabidopsis thaliana] Length = 543
199    2028199    [illegible] Pkc_Phospho_Site(163-165)
200    2028200    3' 8E-65 >gi|6094242|sp|O23264|SBP_ARATH SELENIUM-BINDING PROTEIN >gi|2244759|emb|CAB10182.1| (Z97335) selenium-binding protein like [Arabidopsis thaliana] Length = 490
201    2028201    3' Tyr_Phospho_Site(558-566)
202    2028202    3' 1E-56 >gi|1483150|dbj|BAA12349| (D84417) monodehydroascorbate reductase [Arabidopsis thaliana] Length = 533
203    2028203    5' Tyr_Phospho_Site(569-575)
204    2028204    5' 5E-43 >gi|5262222|emb|CAB45848.1| (AL080254) reticuline oxidase-like protein [Arabidopsis thaliana] Length = [illegible]
205    2028205    [illegible] peroxisomal integral membrane protein [Arabidopsis thaliana] Length = [illegible]
206    2028206    [illegible] >gi|[illegible]|sp|[illegible]_ARATH OMEGA-6 FATTY ACID DESATURASE, CHLOROPLAST PRECURSOR >gi|493068 (U09503) chloroplast omega-6 fatty acid desaturase [Arabidopsis thaliana] Length = 418
207    2028207    Pkc_Phospho_Site([illegible])
208    2028208    Tyr_Phospho_Site(733-741)
209    2028209    Tyr_Phospho_Site(648-656)
210    2028210    3E-74 >gi|2347098 (U76845) ubiquitin-specific protease [Arabidopsis thaliana] >gi|[illegible]|emb|[illegible]| [illegible] ubiquitin-specific protease [illegible] [Arabidopsis thaliana] Length = [illegible]
211    2028211    [illegible] >sp|[illegible]|AP2_ARATH FLORAL HOMEOTIC PROTEIN APETALA2 >gi|[illegible] [illegible] APETALA2 protein [Arabidopsis thaliana] >gi|2464888|emb|CAB16765.1| (Z99707) APETALA2 protein [Arabidopsis thaliana] Length = [illegible]
212    2028212
213    2028213
214    2028214    Pkc_Phospho_Site(19-21)
215    2028215    1E-59 >gi|2688830 (AF000952) sugar transporter [Prunus armeniaca] Length = 475
216    2028216    Tyr_Phospho_Site(521-528)
217    2028217    Tyr_Phospho_Site(1176-1183)
218    2028218    Tyr_Phospho_Site(718-725)



51 



219 


2028219 


Pkc Phospho Sited 47-1 49) 


220 


2uzo22U 


1 yr rnospno i:>\l^\£.\^^-£i£.^} 


221 


2U2o221 


or: OO ^onlDl lft'^9IMIA1 ARATH NITRATF RFDUCTASF 1 fNRI^ 

Zt"ZZ -^Spir 1 100Z|lNlr\l MrvM 1 n lNlir\r\lc \\\—]-J\^\^ \ r\\j l_ i yi^iixiy 

^nilAftfiTf^i lnirll^'^^99ft nitrafp rpriiirtac;p fNADH^ (EC 166 1^ 1 - ArabidODSis 
thaliana >gi|22757|emb|CAA79494| (Z19050) nitrate reductase [Arabidopsis 
thaliana] >gi|448286|prf||1916406A nitrate reductase [Arabidopsis thaliana] 
Lenqtli = 917 


222 


zU2o222 


1 yr rnospno oiie^z i /-zz^f; 


223 


2U2o22o 


1 yr rnospno oixevO'^ ""^/ 


224 


20zo224 


07 ^cnin'^QQft'^lPRI HFVRR FTHYI FNF-INDLJCIBLE PROTEIN HEVER 
^ni!9i9QQ'i '^iniriiQfinnj.T pth\/lpnp-rp<;nnn<^i\/p nrotpin 1 - Para rubber tree 
>ni 1900*^17 /ivift89^4'\ pthvlene-lnducible orotein [Hevea brasiliensisl Lenoth = 
309 


225 


2028225 


3* Pkc Pliospho Site(43-45) 


22d 


ZUZOZZD 




ZZf 


zUzoZZ/ 


i^' T\/r Phncnhn ^itp/^fi7Q-RftR^ 


228 


2028228 


5* Uch 2 1(102-117) 


229 


2028229 


Q ti"Zo •^yllZZZ^yoo ^mpuuh-z lo^ cuiyic;iic"Miociiouivc»j [^ai aLnuw|joio 
+hiaiian'ai i»nii999AQ'^^ ^AFnn49l7'i pthvlpnp-in«;pn«iitl\/p3 FArabidoDsIs thalianal 

LcliyLii OZO 


230 


2028230 


Tyr Phospiio Site(98-106) 


2o 1 


ZUzozoT 


RP OR *:>HhiiRA Aft9R'^7 11 ^HR'^I'^R^ Rpta-ti ihiilln fZinnla eleaansl Lenath = 
448 


232 


2028232 


Pkc Phospho Site(68-70) 


233 


2028233 


Tyr Phospho Site(71 8-726) 


234 


2028234 


OCT CO A AncTQTf^l /A inH'^'liQ^ nmfoin nhncnhafaQP 90 
ot-OZ >ennD|OAAUOO / 0| ^AJUUOI ly^ proieiil pflUbpildlaatJ ZV-/ 

[Arabidopsis thaliana] Length = 511


235 


2028235 


otr *?"7 ■^/%.-.iD>icriC'i 1 ADD ADATU API IDIMIP PMROMl \C^\ FAQP RFDOY 
3E-o7 >Sp|r4oyo1 jAKr AKAIn ArUKIlNiO tlNUUI>iUULPMoC:-r\c:iJVJA 

PROTEIN (DNA-(APURINIC OR APYRIMIDINIC SITE) LYASE)
>gi|472869|emb|CAA54234| (X76912) ARP protein [Arabidopsis thaliana] Length = 527


236 


2028236 


5E-76 >gb|AAF00669.1|AC008153_21 (AC008153) unknown protein
[Arabidopsis thaliana] Length = 797 


237 


2028237 


Pkc_Phospho_Site(45-47)


238 


2028238 


1E-55 >emb|CAB38817.1| (AL035679) fructose-bisphosphate aldolase 
[Arabidopsis thaliana] Length = 343 


239 


2028239 


3E-56 >gb|AAD28617.1|AF129087_1 (AF129087) mitogen-activated protein


240 


2028240 


Pkc_Phospho_Site(17-19)


241 


2028241 


4E-29>gb|AAF00639.1|AC009540_16 (AC009540) methionine synthase 

TArahiHrirvcic thalian?*! 1 Pnnth = 7R^ 
[MlciUiUOpolo LlialldiiaJ l_t?liyiii — / 


242 


2028242 


3' Pkc_Phospho_Site(23-25)


243 


2028243 


3' 5E-17 >gi|5929906|gb|AAD56636.1|AF162150_1 (AF162150) COP1-
interacting protein CIP8 [Arabidopsis thaliana] Length = 334


244 


2028244 


3' Tyr_Phospho_Site(566-573) 


245 


2028245 


3' 1E-49 >gi|3256068|emb|CAA74397| (Y14068) Heat Shock Factor 3
[Arabidopsis thaliana] Length = 520 


246 


2028246 


5' Pkc_Phospho_Site(165-167) 


247 


2028247 


5' 4E-26 >gi|123078|sp|P13723|HEXA_DICDI BETA-HEXOSAMINIDASE
ALPHA CHAIN PRECURSOR (N-ACETYL-BETA-GLUCOSAMINIDASE) (BETA-
N-ACETYLHEXOSAMINIDASE) >gi|84092|pir||A30766 beta-N-
acetylhexosaminidase (EC 3.2.1.52) A precursor - slime mold (Dictyostelium
discoideum) >gi|167841 (J04065) beta-N-acetyl


248 


2028248 


5' Rgd(146-148)






5' Pkc_Phospho_Site(61-63)


250 


2028250 


2E-37 >emb|CAA09371.1| (AJ010829) GRAB1 protein [Triticum sp.]
Length = 287 






-^F-Rd ><?nlP46644IAAT3 ARATH ASPARTATE AMINOTRANSFERASE, 
CHLOROPLAST PRECURSOR (TRANSAMINASE A) >gi|693692 (U15034)
aspartate aminotransferase [Arabidopsis thaliana] Length = 449






7 >nhl AAD1 1 583 1 1AAD1 1 583 (AF071 527) hypothetical protein 
[Arabidopsis thaliana] >gi|4262169|gb|AAD14469| (AC005275) hypothetical 
protein [Arabidopsis thaliana] Length = 236


253 


2028253 


1E-58 >pir||S57478 small GTP-binding protein - garden pea
>nil8715n8lpmblCAA90082l (Z49902) small GTP-bindinq protein [PIsum sativum] 
Length = 215 


254 


2028254 


2E-21 >emb|CAA16710.1| (AL021687) RNase L Inhibitor-like protein
[Arabidopsis thaliana] Length = 600


255 


2028255 


1E-35 >emb|CAA16929.1| (AL021768) resistance protein RPP5-like
[Arabidopsis thaliana] Length = 1715


256 


2028256 


8E-11 >sp|P49208|RK1_PEA 50S RIBOSOMAL PROTEIN L1,
CHLOROPLAST PRECURSOR >gi|577089|emb|CAA58020| (X82776)
chloroplast ribosomal protein L1 [Pisum sativum] Length = 208


257 


2028257 


2E-39 >dbj|BAA25989| (D89051) ERD6 protein [Arabidopsis thaliana] 
Length = 496 


258 


2028258 


Pkc_Phospho_Site(71-73)


259 




vcr 77 f;oj^R7 ^APnn9Q8fi^ Similar to CREB-bindino orotein 
homolog gb|U88570 from D. melanogaster and contains similarity to callus- 
associated protein gb|U01961 from Nicotiana tabacum. EST gb|W43427 comes 
from this gene. [Arabidopsis thaliana] Length = 1516


260 


2028260 


3E-27 >gi|4038040 (AC005936) proteinase inhibitor II [Arabidopsis
thaliana] Length = 77 


261 


2028261


f;p ^cniP9'^Rc;ilFlRP ARATH FRUCTOSE-1 6-BISPHOSPHATASE, 
PHI OROPI A<=sT PRFPURSOR fD-FRUCTOSE-1 6-BISPHOSPHATE 1- 
PHn^PHDHYDROLASEI fFBPASE) >ail99693|DirllS16582 fructose- 
bisphosphatase (EC 3.1.3.11) precursor, chloroplast - Arabidopsis thaliana
>niM i949lpmhiCAA41 1541 fX58148) fructose-bisDhosphatase [Arabidopsis 
thaliana] Length = 417






3' Pkc_Phospho_Site(20-22)


263


2028263


Tyr_Phospho_Site(469-475)


264 


2028264 


3' 4E-60 >gi|6358806|gb|AAF07386.1|AC010675_9 (AC010675) peptide 

transporter [Arabidopsis thaliana] Length = 644


265 


2028265 


5' Tyr Phospho Site(290-297) 




2028266


5' Tyr_Phospho_Site(359-367)


267 


2028267 


5' Tyr Phospho Site(357-365) 


268 


2028268 


5' 1E-11 >gi|1651723|dbj|BAA16651| (D90899) phosphoglycerate mutase
[Synechocystis sp.] Length = 349


269 


2028269 


1E-101 >emb|CAB52675.1| (AJ010971) glucose-6-phosphate 1-
dehydrogenase [Arabidopsis thaliana] Length = 515


270




Tyr_Phospho_Site(275-283)


271 


2028271 


3E-20 >gi|3252979 (AF068920) Ras-binding protein SUR-8 [Homo
sapiens] >gi|3293320 (AF054828) leucine-rich repeat protein SHOC-2 [Homo 
sapiens] Length = 582 


272 


2028272 


Pkc_Phospho_Site(137-139)


273 


2028273 


3E-44 >dbj|BAA06311| (D30622) novel serine/threonine protein kinase
[Arabidopsis thaliana] Length = 421


274 


2028274 


Rgd(348-350) 


275 


2028275 


6E-39 >sp|P81291|LE22_METJA 3-ISOPROPYLMALATE DEHYDRATASE
LARGE SUBUNIT (ISOPROPYLMALATE ISOMERASE) (ALPHA-IPM
ISOMERASE) (IPMI) >gi|2127740|pir||C64362 aconitate hydratase (EC 4.2.1.3) -
Methanococcus jannaschii >gi|1591201 (U67499) 3-isopropylmalate dehydratase
(leuC) [Methanococcus jannaschii] Length = 424


276 


2028276 


5E-83 >gi|4106395 (AF073744) raffinose synthase [Cucumis sativus]


277 


2028277 


Pkc Phospho Site(19-21) 


278 




T\/r Phocnhn Qita/^A1 -^4R^ 


279 


2028279 


5E-39 >pdb|1SOX|A Chain A, Sulfite Oxidase From Chicken Liver
>gi|3212611|pdb|1SOX|B Chain B, Sulfite Oxidase From Chicken Liver Length =
466 


280 


2028280 


6E-45 >pir||S20940 DNA-binding protein - Arabidopsis thaliana Length =


281 


2028281 


Tyr_Phospho_Site(452-460)


282 


2028282 


1E-29 >gi|473874 (U08285) a membrane-associated salt-inducible
protein [Nicotiana tabacum] Length = 435 


283 


2028283 


Tyr_Phospho_Site(278-285) 


284 




QP Q7 ^niM 'M'kMA n R4744^ r\/tnrhrnmp P-45n fPhalaenoDsis SD 
^hybrid SM91081 Length = 426 


285 


2028285 


8E-89 >gi|1935914 (U77347) lethal leaf-spot 1 homolog [Arabidopsis
thaliana] Length = 539


286 


2028286 


3' 2E-25 >gi|2323344 (AF014806) alpha-glucosidase 1 [Arabidopsis
thaliana] Length = 902 


287 


2028287 


3' Pkc_Phospho_Site(97-99) 


288 


2028288 


3' 3E-12 >gi|6320470|ref|NP_010550.1|AKR1| Ankyrin repeat-containing
protein; Akr1p >gi|728821|sp|P39010|AKR1_YEAST ANKYRIN REPEAT-

CONTAIN INCj rKU 1 blN Ar\K1 >gi|DZDUy4|pir||o^ooz i Mrs.r\ i pruieifi yecit)i 

(Saccharomyces cerevisiae) >gi|466522 (L31407) ankyrin repeat-containing
protein [Saccharomyces cerevisiae] >gi|1230637 (U51030) Akr1p: Ankyrin
repeat-containing protein (Swiss Prot. accession number P39010).
[Saccharomyces cerevisiae] >gi|1586336|prf||2203403A ankyrin repeat-
containing protein [Saccharomyces cerevisiae] Length = 764


289 


2028289 


3' Pkc_Phospho_Site(40-42) 


290 


2028290 


3' 4E-43 >gi|4726118|gb|AAD28318.1|AC006436_9 (AC006436) somatic
embryogenesis receptor-like kinase [Arabidopsis thaliana] Length = 520


291 


2028291 


5' Pkc_Phospho_Site(42-44) 


292 


2028292 


5' 3E-15 >gi|4539386|emb|CAB37452.1| (AL035526) extensin-like protein
[Arabidopsis thaliana] Length = 839


293 


2028293 


5' Tyr_Phospho_Site(720-727) 


294 


2028294 


5' 2E-55 >gi|2129597|pir||S71217 glutamate dehydrogenase 1 - Arabidopsis 
thaliana >gi| luyoyuU ^Uof i f *) giuiamaie aenyuroyciicabc i [Miauiuu^isio 
thaliana] >gi|1293095 (U53527) glutamate dehydrogenase 1 [Arabidopsis
thaliana] Length = 411


295 


2028295 


3E-30 >gb|AAD34702.1|AC006341_30 (AC006341) Similar to gb|D14414 Indole-
3-acetic acid induced protein from Vigna radiata. ESTs gb|AA712892 and
gb|Z17613 come from this gene. [Arabidopsis thaliana] Length = 147


296 


2028296 


3E-11 >pir||S47536 SWH1 protein (version 2) - yeast (Saccharomyces
cerevisiae) >gi|402658|emb|CAA52646| (X74552) SWH1 [Saccharomyces
cerevisiae] >gi|1090523|prf||2019253A oxysterol-binding protein-like protein
[Saccharomyces cerevisiae] Length = 1190


297


2028297 


Tyr_Phospho_Site(8-16)




2028298 


Tyr_Phospho_Site(397-404)


299 


2028299 


Pkc_Phospho_Site(15-17)


300


2028300


Pkc_Phospho_Site(221-223)


301




6E-43 >pir||S71229 RNA-binding protein 37 - Arabidopsis thaliana
>gi|1174153 (U44134) RNA-binding protein [Arabidopsis thaliana] Length = 336




2028302 


Tyr_Phospho_Site(817-824)


303 


2028303 


1E-90 >emb|CAB43971.1| (AL078579) beta-glucosidase [Arabidopsis 
thaliana] Length = 517 


304 


2028304 


Pkc_Phospho_Site(43-45)




2028305


Tyr_Phospho_Site(45-51)


306 


2028306 


3' Tyr Phospho Site(208-215) 


307




'>P-'V\ >nil^77fi'S7? fAC005388) ESTs ablR65052, qb|AA712146, 
nhlH7fiW nblH76282 abiAA650771 ablH76287 qb|AA650887, gb|N37383, 
nhl7?g721 and ablZ29722 come from this aene. [Arabidopsis thaliana] Length = 
285 


308




3' 7E-11 >gi|3560235|emb|CAA20703.1| (AL031530) hypothetical zinc finger
protein [Schizosaccharomyces pombe] Length = 680


309 


2028309 


5' Pkc Phospho Site(39-41) 


310




5' Tyr Phospho Site(310-317) 


311




5' Pkc Phospho Site(84-86) 


312 


2028312 


5' Pkc_Phospho_Site(16-18)


313 


2028313 


Pkc Phospho Site(20-22) 






IF 19 >nili'S4fiQ2 fM73322^ cellulase E-4 nThermomonospora fuscal 


315 


2028315 


Pkc Phospho Site(92-94) 


316


2028316


oc Rr» inii94R9ft94 /AFn00fi57^ similar to Jun activation domain bindina 
protein [Arabidopsis thaliana] >gi|2791885 (AF042334) JAB1 [Arabidopsis 
thaliana] Length = 357 


317 


2028317 


Tyr Phospho Site(725-733) 


318 


2028318


Ap: RA \ -i-nhi A AHARft'^T ilAPlRfi'^'S1 1 ^AF16fi351^ alanine'olvoxvlate 
aminotransferase 2 homolog [Arabidopsis thaliana] Length = 476 


319 


2028319 


6E-43 >sp|P42731|PAB2_ARATH POLYADENYLATE-BINDING PROTEIN 2
(POLY(A) BINDING PROTEIN 2) (PABP 2) >gi|304109 (L19418) poly(A)-binding
protein [Arabidopsis thaliana] >gi|2911051|emb|CAA17561| (AL021961) poly(A)-
binding protein [


320 


2028320 


Pkc_Phospho_Site(41-43)


321 


2028321 


2E-20 >dbj|BAA25989| (D89051) ERD6 protein [Arabidopsis thaliana]
Length = 496


322 


2028322 


6E-43 >sp|Q42208|RL7_ARATH 60S RIBOSOMAL PROTEIN L7

>nii'^9i 9ft7Q /APnn4nn'i^ ribosomal orotein L7 TArabidoDsis thalianal Lenqth = 

247 


323 


2028323 


4E-11 >emb|CAB53646.1| (AL110123) multidrug resistance protein/P-
glycoprotein-like [Arabidopsis thaliana] Length = 1222


324 


2028324 


3' 4E-18 >gi|3941528 (AF062918) transcription factor [Arabidopsis 
thaliana] Length = 335


325 


2028325 


3' Tyr_Phospho_Site(808-815)


326 


2028326 


3' 1E-19 >gi|1694711|emb|CAA70769| (Y09581) FRO1 [Arabidopsis thaliana]
Length = 704 


327 


2028327 


3' 8E-12 >gi|2894597|emb|CAA17131.1| (AL021889) bHLH protein-like
[Arabidopsis thaliana] Length = 589 






328 


2028328 


3' 3E-28 >gi|461812|sp|Q05047|CP72_CATRO CYTOCHROME P450 72A1
(CYPLXXII) (PROBABLE GERANIOL-10-HYDROXYLASE) (GE10H) >gi|167484
(L10081) Cytochrome P-450 protein [Catharanthus roseus]
>gi|445604|prf||1909351A cytochrome P450 [Catharanthus roseus] Length = 524


329 


2028329 


3' 2E-15 >gi|400972|sp|P30986|RETO_ESCCA RETICULINE OXIDASE
PRECURSOR (BERBERINE-BRIDGE-FORMING ENZYME) (BBE)
(TETRAHYDROPROTOBERBERINE SYNTHASE) >gi|99506|pir||A41533
reticuline oxidase (EC 1.5.3.9) precursor - California poppy >gi|239110|bbs|65555
(S65550) (S)-reticuline:oxygen oxidoreductas


330 


2028330 


5' Tyr_Phospho_Site(71-79) 


331 


2028331 


5' 2E-69 >gi|123340|sp|P14891|HMD1_ARATH 3-HYDROXY-3-
METHYLGLUTARYL-COENZYME A REDUCTASE 1 (HMG-COA REDUCTASE
^\ >niiQQ7i4lnirllA32107 hvdroxvmethvlalutarvi-CoA reductase 

(NADPH) (EC 1.1.1.34) 1 - Arabidopsis thaliana >gi|16336|emb|CAA33139|
(X15032) hydroxymethylglutaryl CoA reductase






5' 3E-19 >gi|5731257|gb|AAD48836.1|AF165924_1 (AF165924) auxin-
induced basic helix-loop-helix transcription factor [Gossypium hirsutum] Length =
314 


333 


2028333 


5' Pkc Phospho Site(22-24) 






Tyr_Phospho_Site(17-24)


335 


2028335 


Tyr_Phospho_Site(1196-1204)


336 


2028336 


1E-53 >sp|Q08467|KC21_ARATH CASEIN KINASE II, ALPHA CHAIN 1 (CK
II) >gi|419752|pir||S31098 casein kinase II (EC 2.7.1.-) alpha-type chain (clone
ATCKA1) - Arabidopsis thaliana >gi|391603|dbj|BAA01090| (D10246) casein

Unooo II ootdK/tir' cnhi init TArahiflnn*;]*? thaliana! Lenath — 33 

Kinase ll OalaiyUU oUUUIML [/-Mc*UILIw|JOiO LI iciiicii ictj 1-^1 ivjii 1 WW 


337 


2028337 


4E-28 >sp|O24164|PPOM_TOBAC PROTOPORPHYRINOGEN OXIDASE,
MITOCHONDRIAL (PPO II) (PROTOPORPHYRINOGEN IX OXIDASE ISOZYME 
ii\ /PPY \\\ >nii'?'^7n'^'^'iif»mbiCAA73866l fYl 34661 DrotoDorDhvrinoQen oxidase 

[Nicotiana tabacum] >gi|3929920|dbj|BAA34712| (AB020500) mitochondrial
protoporphyrino 


338 


2028338 


2E-50 >emb|CAA63010| (X91917) LEA D113 homologue type2
[Arabidopsis thaliana] >gi|3668076 (AC004667) LEA D113 type2 protein
[Arabidopsis thaliana] Length = 97


339 


2028339 


4E-12 >gi|2224915 (U95968) beta-expansin [Oryza sativa] Length = 261


340 


2028340 


Tyr_Phospho_Site(494-501)


341


2028341


'^F 1ft >c;nlPMQ?filMY01 LYCES MYO-INOSITOL-1(OR 4)- 
MONOPHOSPHATASE 1 (IMP 1) (INOSITOL MONOPHOSPHATASE 1)
>gi|1098977 (U39444) myo-inositol monophosphatase 1 [Lycopersicon
esculentum] Length = 273 


342 


2028342 


Pkc Phospho Site(8-10) 


343




Pkc_Phospho_Site(13-15)


344


2028344


Tyr_Phospho_Site(665-672)


345 


2028345 


9E-65 >emb|CAA74028.1| (Y13694) multicatalytic endopeptidase complex,
proteasome precursor, beta subunit [Arabidopsis thaliana]
>gi|2827525|emb|CAA16533.1| (AL021633) multicatalytic endopeptidase
complex, proteasome precu


346 


2028346 


Tyr_Phospho_Site(314-321)


347




-IP >nii'^f^Anift'^ fACn04122> Hiohlv Similar to branched-chain amino 
acid aminotransferase [Arabidopsis thaliana] Length = 318 


348 


2028348 


2E-11 >emb|CAB10522.1| (Z97343) DNA-binding protein homolog
[Arabidopsis thaliana] Length = 459 


349 


2028349 


7E-15 >emb|CAA69072| (Y07765) S-adenosylmethionine decarboxylase
[Arabidopsis thaliana] Length = 51 


350 


2028350 


2E-27 >sp|P19954|RR30_SPIOL 30S RIBOSOMAL PROTEIN S30,
CHLOROPLAST PRECURSOR (CS-S5) (CS5) (S22) (RIBOSOMAL PROTEIN 1)
(PSRP-1) >gi|279640|pir||R3SPS5 ribosomal protein CS-S22 precursor,
chloroplast - spinach >gi|12316|emb|CAA41960| (X59270) chloroplast ribosomal
protein S22 [Spinacia oleracea] >gi|18031|emb|CAA33403| (X15344) spinach
S22 r-protein [Spinacia oleracea] Length = 302


351 


2028351 


3' Tyr_Phospho_Site(344-350) 


352 


2028352 


5' 3E-65 >gi|3164126|dbj|BAA28531| (D78598) cytochrome P450
monooxygenase [Arabidopsis thaliana] >gi|5262761|emb|CAB45909.1|
(AL080283) cytochrome P450 monooxygenase [Arabidopsis thaliana] Length = 
499 


353




5' 1E-76 >gi|5915830|sp|Q96514|C7B7_ARATH CYTOCHROME P450 71B7

>gi|1523796|emb|CAA66458| (X97864) cytochrome P450 [Arabidopsis thaliana] 
>gi|4850394|gb|AAD31064.1|AC007357J3 (AC007357) Identical to gb|X97864 
cytochrome P450 from Arabidopsis thaliana and is a member of the PF|00067 
Cytochrome


354 


2028354 


5' Tyr Phospho Site(209-216) 






5' Tyr Phospho Site(823-831) 


356




5' Pkc Phospho Site(6-8) 




2028357


5' 3E-45 >gi|5541691|emb|CAB51197.1| (AL096859) glucuronosyl
transferase-like protein (fragment) [Arabidopsis thaliana] Length = 271


358 


2028358 


4E-39 >gi|3201623 (AC004669) shaggy-like kinase dzeta [Arabidopsis
thaliana] Length = 412 




2028359 


Pkc Phospho Site(2-4) 


360 


2028360 


Tyr Phospho Site(638-645) 


361




Tyr_Phospho_Site(297-304)


362 


2028362 


5E-83 >gi|2275196 (AC002337) water stress-induced protein, WSI76
isolog [Arabidopsis thaliana] >gi|4630746|gb|AAD26596.1|AC007236_1
(AC007236) water stress-induced protein [Arabidopsis thaliana] Length = 344


363 


2028363 


4E-76 >gb|AAD20113| (AC006304) proline iminopeptidase [Arabidopsis

thaliana! ! Pnnth = '^?Q 

LI tClllCfl iCIJ {_C7M^L1 1 \J£»iJ 


364 


2028364 


1E-48 >emb|CAA66964| (X98320) peroxidase [Arabidopsis thaliana]
>gi|1429215|emb|CAA67310| (X98774) peroxidase ATP6a [Arabidopsis thaliana] 
Length = 336 


365 


2028365 


3E-31 >gb|AAB95298.1| (AC003105) beta-ketoacyl-CoA synthase
[Arabidopsis thaliana] Length = 509


366 


2028366 


Tyr_Phospho_Site(370-378) 


367


2028367


1E-39 >emb|CAA65384| (X96539) malate dehydrogenase
[Mesembryanthemum crystallinum] Length = 332


368


2028368


3' Tyr_Phospho_Site(176-183)


369


2028369


3' Pkc_Phospho_Site(10-12)


370


2028370


3' 2E-52 >gi|2739376 (AC002505) permease [Arabidopsis thaliana] 
Length = 551 


371




3' 2E-53 >gi|2316016 (U92650) MRP-like ABC transporter
[Arabidopsis thaliana] Length = 1515


372 


2028372 


3' Tyr_Phospho_Site(414-420)


373


2028373


5' Tyr_Phospho_Site(10-17)


374 


2028374 


5' 5E-77 >gi|2129553|pir||S71774 calcium-dependent protein kinase 6 -
Arabidopsis thaliana Length = 529 


375 


2028375 


5' Pkc_Phospho_Site(53-55)


376 


2028376 


5' 1E-42 >gi|1495768|emb|CAA92823| (Z68506) chloroplast inner envelope
protein, 110 kD (IEP110) [Pisum sativum] Length = 996 


377 


2028377 


5' 2E-75 >gi|3914425|sp|O23717|PRCE_ARATH PROTEASOME EPSILON
CHAIN PRECURSOR (MACROPAIN EPSILON CHAIN) (MULTICATALYTIC
ENDOPEPTIDASE COMPLEX EPSILON CHAIN) >gi|2511596|emb|CAA74029.1|
(Y13695) multicatalytic endopeptidase complex, proteasome precursor, beta
subunit [Arabidopsis thaliana] >gi|


378 


2028378 


3E-48 >gi|2088650 (AF002109) peroxisomal ATP/ADP carrier protein
isolog [Arabidopsis thaliana] Length = 331


379 


2028379 


Pkc_Phospho_Site(40-42) 


380 


2028380 


3E-16 >gb|AAD39612.1|AC007454_11 (AC007454) Similar to gb|X92204 NAM
gene product from Petunia hybrida. ESTs gb|H36656 and gb|AA651216 come
from this gene. [Arabidopsis thaliana] Length = 557


381 


2028381 


8E-79 >emb|CAA65051| (X95736) amino acid permease 6 [Arabidopsis
thaliana] Length = 481 


382 


2028382 


Pkc_Phospho_Site(65-67) 


383 


2028383 


3E-18 >gb|AAD46412.1|AF096262_1 (AF096262) ER6 protein [Lycopersicon
esculentum] Length = 168


384 


2028384 


1E-81 >gi|2827139 (AF027172) cellulose synthase catalytic subunit
[Arabidopsis thaliana] >gi|4049343|emb|CAA22568.1| (AL034567) cellulose
synthase catalytic subunit (RSW1) [Arabidopsis thaliana] Length = 1081 


385 


2028385 


Pkc_Phospho_Site(9-11)


386 


2028386 


6E-13 >gi|2342674 (AC000106) Similar to ATP-dependent Clp protease
^nhlnQnQ1'^^ fst nhlN65461 comes from this cene. [Arabidopsis thalianal 
Length = 292 


387 


2028387 


7E-46 >gb|AAD29776.1|AF074021_8 (AF074021) symbiosis-related protein
[Arabidopsis thaliana] Length = 122 


388 


2028388 


4E-41 >dbj|BAA07555| (D38552) The ha1539 protein is related to
cyclophilin. [Homo sapiens] Length = 645


389 


2028389 


Tyr_Phospho_Site(858-864)




2028390


-1 P AQ >nirll<=i7i 9fi^ fprritin - ArabidoDsis thaliana 

>nll1 246401 lemblCAA63932l (X94248) ferritin [Arabidopsis thaliana] Length = 

255 


391




Tyr_Phospho_Site(582-588)




2028392


3' Pkc_Phospho_Site(34-36)


393




3' Tyr_Phospho_Site(231-239)


394




Pkc_Phospho_Site(31-33)


395


2028395


fiF-9S >nil?nQ8713 (U82977) pectinesterase [Citrus sinensis] 
Length = 510 


396 


2028396 


3' Tyr_Phospho_Site(93-100)


397 


2028397 


5' Tyr_Phospho_Site(287-293)


398 


2028398


Pier- Phncnhn ^itp>^99-94^ 


399 


2028399


PU-r PhnQnhn ^itp^'^7-^Q'\ 

O rl\0 1 (lUo|JllU Ollcy^vJi sj\j } 


400 


2028400 


5' 2E-36 >gi|1170170|sp|P46602|HAT3_ARATH HOMEOBOX-LEUCINE
ZIPPER PROTEIN HAT3 (HD-ZIP PROTEIN 3) >gi|549889 (U09338) homeobox
protein [Arabidopsis thaliana] >gi|549890 (U09339) homeobox protein
[Arabidopsis thaliana] Length = 315


401 


2028401 


Tyr Phospho Site(384-390) 


402 


2028402 


1E-54 >sp|P43188|KADC_MAIZE ADENYLATE KINASE, CHLOROPLAST

/A-rn ARjir* TO AMODLj/^ODi_jr\DVi AQP\ ^nllftOQftft'^lnirll^ARR'^A aripnvlatp kina<?p 
(ATP-AMr 1 KANoKrIUornUKTLAotij '=^gi|Dzyooo]pii Ijo'+ODOH- duciiyidLc; miiqoc 

(EC 2.7.4.3), chloroplast - maize >gi|3114421|pdb|1ZAK|A Chain A, Adenylate
Kinase From Maize In Complex With The Inhibitor P1,P5-Bis(Adenosine-5'-
)pentaphosphate (Ap5a) >gi|3114422|pdb|1ZAK|B Chain B, Adenylate Kinase
From Maize In Complex With The Inhibitor P1,P5-Bis(Adenosine-5'-
)pentaphosphate (Ap5a) Length = 222


403 


2028403 


1E-101 >sp|P54888|P5C2_ARATH DELTA 1-PYRROLINE-5-CARBOXYLATE
SYNTHETASE B (P5CS B) [INCLUDES: GLUTAMATE 5-KINASE (GAMMA-
GLUTAMYL KINASE) (GK); GAMMA-GLUTAMYL PHOSPHATE REDUCTASE
(GPR) (GLUTAMATE-5-SEMIALDEHYDE DEHYDROGENASE) (GLUTAMYL-
GAMMA-SEMIALDE... >gi|887388|emb|CAA60447| (X86778) pyrroline-5-
carboxylate synthetase B [Arabidopsis thaliana] >gi|1669658|emb|CAA70527|
(Y09355) pyrroline-5-carboxlyate synthetase [Arabidopsis thaliana] Length = 726


404 


2028404 


Tyr_Phospho_Site(585-592) 






6E-40 >pir||HSWT4 histone H4 - wheat >gi|70773|pir||HSPM4 histone 
H4 - garden pea Length = 102 




2028406 


Tyr Phospho Site(329-336) 


407 


2028407 


Pkc_Phospho_Site(117-119)


408 


2028408 


3E-93 >gb|AAD16946| (AF106324) sodium proton exchanger Nhx1
[Arabidopsis thaliana] Length = 538 






Tyr_Phospho_Site(852-860)


410




Pkc_Phospho_Site(66-68)


411


2028411


3' 2E-18 >gi|629728|pir||S46959 porin 1, 36K - potato
>gi|1076680|pir||C55364 porin (clone pPOM 36.1) - potato mitochondrion
>gi|515358|emb|CAA56601| (X80388) 36kDa porin 1 [Solanum tuberosum] 
Length = 276 


412


2028412


3' Tyr_Phospho_Site(330-337)


413 


2028413 


3' Tyr Phospho Site(208-215) 


414 


2028414 


3' Pkc_Phospho_Site(55-57)


415


2028415


r IF-?'^ >nil2499535lsDl041364ISOT1 SPIOL 2-0X0 GLUTARATE/MAL ATE 
TRANSLOCATOR PRECURSOR >gi|595681 (U13238) 2-oxoglutarate/malate
translocator [Spinacia oleracea] Length = 569 


416


2028416


3' 1E-10 >gi|99749|pir||S20918 probable serine/threonine-specific protein
kinase ATPK64 (EC 2.7.1.-) - Arabidopsis thaliana >gi|217843|dbj|BAA01731|
(D10937) protein kinase [Arabidopsis thaliana] Length = 498


417


2028417


3' Tyr Phospho Site(693-701) 


418 


2028418 


3' Pkc_Phospho_Site(115-117)


419 


2028419 


5' Pkc Phospho Site(2-4) 


420 


2028420 


5' 6E-77 >gi|5730139|emb|CAB52472.1| (AJ243705) ferredoxin-NADP+
reductase [Arabidopsis thaliana] Length = 360




2028421




422 


2028422 


8E-13 >gb|AAD41415.1|AC007727_4 (AC007727) Contains similarity to
gb|U07707 epidermal growth factor receptor substrate (eps15) from Homo
sapiens and contains 2 PF|00036 EF hand domains. ESTs gb|T44428 and
gb|AA395440 come from this gene. [Arabidop... Length = 1181


423 


2028423 


Tyr_Phospho_Site(412-419)


424 


2028424 


9E-72 >gb|AAD32285.1|AC006533_9 (AC006533) poly(ADP-ribose) 
glycohydrolase [Arabidopsis thaliana] Length = 997


425


2028425


Tyr_Phospho_Site(77-84)


426


2028426


Tyr_Phospho_Site(800-807)


427 


2028427 


1E-22 >ref|NP_004658.1|PHERC2| hect domain and RLD 2
>n]!407<5809lablAAD08657 11 (AF071172) HERC2 [Homo sapiens] Lenqth = 
4834 


428


2028428




429 


2028429 


Tyr Phospho Site(399-406) 


430


2028430


fiF-*S4 >nil97'^Q'^fi8 fAC002505) cvclin-llke protein [Arabidopsis thaliana] 
Length = 361 


431 


2028431 


6E-46 >gb|AAD21729.1| (AC006931) citrate synthase [Arabidopsis
thaliana] Length = 509 


432 


2028432 


2E-45 >gi|2459448 (AC002332) cinnamoyl-CoA reductase [Arabidopsis
thaliana] Length = 321 


433 


2028433 


1E-27 >gb|AAD39990.1|AF150083_1 (AF150083) small zinc finger-like protein
[Arabidopsis thaliana] Length = 77 






434 


2028434 


2E-44 >gi|2829133 (AF043351) adenosine-5'-phosphosulfate-kinase
[Arabidopsis thaliana] >gi|4490745|emb|CAB38907.1| (AL035708) adenosine-5'- 
phosphosulfate-kinase [Arabidopsis thaliana] Length = 293 


435 


2028435 


Pkc_Phospho_Site(21-23) 


436 


2028436 


2E-47 >dbj|BAA77358.1| (AB020023) DNA-binding protein NtWRKYS
[Nicotiana tabacum] Length = 328


437 


2028437 


Pkc Phospho Site(41-43) 


438


2028438 


3' Tyr Phospho Site(28-35) 


439 


2028439 


3' Tyr Phospho Site(210-217) 


440 


2028440 


3' 5E-18 >gi|2827665|emb|CAA16619.1| (AL021637) vacuolar sorting
receptor-like protein [Arabidopsis thaliana] Length = 626 


441 


2028441 


3' 3E-25 >gi|1419090|emb|CAA64422| (X94968) 37kDa chloroplast inner
envelope membrane polypeptide precursor [Nicotiana tabacum] Length = 335 


442 


2028442 


3' Tyr_Phospho_Site(681-688)


443 


2028443 


3' 8E-69 >gi|5921663|gb|AAD56290.1|AF162279_1 (AF162279) 10-
formyltetrahydrofolate synthetase [Arabidopsis thaliana] Length = 634


444 


2028444 


5' Tyr_Phospho_Site(422-428)


445


2028445


7F-'S'^ >nlRQ14097lsDlO49071IMYOP MESCR MYO-INOSITOL-1(OR4)- 
MONOPHOSPHATASE (IMP) (INOSITOL MONOPHOSPHATASE) >gi|2708322 
(AF037220) inositol monophosphatase [Mesembryanthemum crystallinum] 
1 ftnnth = 970 


446 


2028446 


5' 2E-26 >gi|2921323|gb|AAC04713.1| (AF034112) beta-1,3-glucanase 7
[Glycine max] Length = 245


447 


2028447 


5' Tyr_Phospho_Site(102-109)






Pkc_Phospho_Site(17-19)


449 


2028449 


Tyr Phospho Site(658-664) 


450


2028450


1 F-9'^ >pmhinAA18QQ1 1 (AL023518) transDort protein 
[Schizosaccharomyces pombe] Length = 397


451




1E-106 >gi|2737926 (U77673) fimbrin-like protein AtFim2 [Arabidopsis
thaliana] Length = 456 


452 


2028452 


4E-84 >gi|3643604 (AC005395) receptor-like protein kinase 
[Arabidopsis thaliana] Length = 960


453




Pkc_Phospho_Site(7-9)


454 


2028454 


Tyr_Phospho_Site(1156-1162)


455 


2028455


or 7n ■j.niiAfiQft'^oi l\ l7Q1fin^ HMG-CoA svnthase FArabidoDsis thaliana! 
>gi|4098523 (U79161) HMG-CoA synthase [Arabidopsis thaliana] 
>nii'^nn9Ri7lpmhinAB44320 11 (AL078606) hvdroxvmethvlalutarvl-CoA synthase 
[Arabidopsis thaliana] Length = 461 


456


2028456


'^F-79 >nil9'SR'^1 1 1 (AC002387) dihvdrodiDicolinate synthase 
[Arabidopsis thaliana] Length = 365 


457




QF-7Q \ >emblCAA35887l (X51514) precursor acetolactate synthase (670 
AA) [Arabidopsis thaliana] Length = 670






4E-86 >dbj|BAA84380.1| (AP000423) PSII D2 protein [Arabidopsis
thaliana] Length = 353


459 


2028459 


2E-67>emb|CAA76758.1| (Y17386) In2.1 protein [Triticum aestivum] 
1 pnnth = 943 


460 


2028460 


Pkc_Phospho_Site(45-47)


461 


2028461 


Tyr Phospho Site(349-357) 


462 


2028462 


Tyr Phospho Site(303-310) 


463 


2028463 


3' 1E-28 >gi|4106340|gb|AAD02810| (AF062396) protein phosphatase 2A
regulatory subunit isoform B' delta [Arabidopsis thaliana] Length = 477 


464 


2028464 


3' 5E-41 >gi|4185133 (AC005724) zinc finger protein [Arabidopsis
thaliana] Length = 181 






465 


2028465 


3' 1E-44 >gi|4678357|emb|CAB41167.1| (AL049659) cytochrome P450-like
protein [Arabidopsis thaliana] Length = 490


466 


2028466 


5' Pkc_Phospho_Site(82-84)


467 


2028467 


5' 1E-31 >gi|2500185|sp|Q23862|RACE_DICDI RAS-RELATED PROTEIN
RACE >gi|1373067 (U41222) RacE [Dictyostelium discoideum] Length = 223


468




8E-74 >gi|4587685|gb|AAD25855.1|AC007197_8 (AC007197)
methylmalonate semi-aldehyde dehydrogenase [Arabidopsis thaliana] Length = 
607 


469 


2028469 


5' 2E-72 >gi|2494174|sp|Q42521|DCE1_ARATH GLUTAMATE
DECARBOXYLASE 1 (GAD 1) >gi|497979 (U10034) glutamate decarboxylase 
[Arabidopsis thaliana] Length = 502 


470 


2028470 


5' 6E-75 >gi|5669047|gb|AAD46145.1| (AF081573) 19S proteasome
regulatory complex subunit S6A [Arabidopsis thaliana] Length = 424 


471 


2028471 


5' Pkc_Phospho_Site(20-22)


472




R» '^F-71 >ail2501056lsDlQ39230ISYS ARATH SERYL-TRNA SYNTHETASE 
(SERINE--TRNA LIGASE) (SERRS) >gi|2129737|pir||S71293 seryl-tRNA
synthetase - Arabidopsis thaliana >gi|1359497|emb|CAA94388| (Z70313) seryl-
tRNA Synthetase [Arabidopsis thaliana] Length = 451


473 


2028473 


Pkc Phospho Site(49-51) 


474 


2028474 


Pkc Phospho Site(26-28) 


475 


2028475 


Tyr_Phospho_Site(217-225)






AP Ri >fimhirAAfi7'^'Hfil (X98804'i oeroxldase ATP1 8a [Arabldopsis 
thaliana] Length = 346 


477 


2028477 


4E-34 >sp|P56286|IF2A_SCHPO EUKARYOTIC TRANSLATION INITIATION
FACTOR 2 ALPHA SUBUNIT (EIF-2-ALPHA) >gi|2706460|emb|CAA15918.1|
(AL021046) eukaryotic translation initiation factor 2 alpha subunit
[Schizosaccharomyces pombe] Length = 306


478 




1F 117 ><;nlPfS4fiOQICC48 ARATH CELL DIVISION CYCLE PROTEIN 48 

1 1 \ I '^oU \ \J^\j\J^ y<^\j^\J r\\\r^ III Ip^ I V 1 w i ■ ^ ■ *— • ■ x^.^ ■bill 

HOMOLOG >gi|2118115|pir||S60112 cell division control protein CDC48 homolog
- Arabidopsis thaliana >gi|1019904 (U37587) cell division cycle protein
[Arabidopsis thaliana] Length = 809


479 


2028479


«yi ^cimhir'A A9'^nnfii ^Al n'H'S'^'Sfi^ mitochondrial uncouDlina orotein 
[Arabidopsis thaliana] Length = 313 


480 


2028480 


>.r<{\o/xri^'XArr /APnn4'>1 9^ Contains similaritv to ARI RINGfinaer 
protein gb|X98309 from Drosophila melanogaster, ESTs gb|T44383, gb|W43120, 
nhlNifi'Sftfift nhlH'=^fi01'^ ablAA042241 ablT76869 and qb|AA042359 come from 
this gene. [Arabidopsis thaliana] Length = 644 


481


2028481


1P fi'^ >niifift979R H 40031^ S-adenosvl-L-methionlneitrans-caffeoyl- 
Pnpnyvmp A '^-O-methvltransf erase [ArabidODsis thaliana] Length = 212 






1E-22 >gi|3687243 (AC005169) ribosomal protein [Arabidopsis
thaliana] Length = 68 






7E-42 >gi|3415115 (AF081202) villin 2 [Arabidopsis thaliana] Length =
976 


484 


2028484 


Tyr Phospho Site(204-211) 


485 


2028485 


Tyr Phospho Site(58-65) 


486 


2028486 


3' 8E-18 >gi|2804278|dbj|BAA24448| (AB003516) squalene epoxidase [Panax 
ginseng] Length = 539 


487 


2028487


BISPHOSPHOGLYCERATE-INDEPENDENT PHOSPHOGLYCERATE MUTASE 
(PHOSPHOGLYCEROMUTASE) (BPG-INDEPENDENT PGAM) (PGAM-I) 
>gi|2118335|pir||S60473 phosphoglycerate mutase (EC 5.4.2.1) - common ice 
plant >gi|602426 (U16021) phosphoglyceromutase [Mesembryanthemum 
crystallinum] Length = 559 


488 


2028488 


3' Wd Repeats(594-608) 






489 


2028489 


3' Pkc Phospho Site(4-6) 






^i' fiE-fiq >Qil5738864lemblCAA63220 11 (X92486) isocitrate dehydrogenase 
(NAD+) [Solanum tuberosum] Length = 470


491 


2028491 


5' 2E-74 >gi|4927412|gb|AAD33097.1|AF082525_1 (AF082525) homoserine
kinase [Arabidopsis thaliana] Length = 370 


492 


2028492 


5' 1E-60 >gi|3128168 (AC004521) carboxyl-terminal peptidase 
[Arabidopsis thaliana] Length = 415






5' Pkc_Phospho_Site(41-43)


494 


2028494 


5' 3E-62 >gi|4006869|emb|CAB16787.1| (Z99707) patatin-like protein
[Arabidopsis thaliana] Length = 414 


495 


2028495 


3E-18 >gi|3139079 (AF062537) cullin 3 [Homo sapiens] Length = 768


496 


2028496 


Tyr_Phospho_Site(1069-1076)






1 F R'^ >nhlAAr977n7 1 1 fAFn67789^ tSNARE AtTLG2a fArabidoDsis 
thaliana] Length = 322 


498




QP '^R >ni!4nQiftnfi ^AFn^2'S85^ CONST ANS-iike orotein 2 FMalus 
domestical Length = 329 


499 


2028499 


9E-21 >gi|2191133 (AF007269) Arabidopsis thaliana G-box binding
factor 2 (SP:P42774) [Arabidopsis thaliana] Length = 380 


500 


2028500 


4E-50 >gi|3650032 (AC005396) gibberellin-regulated protein GAST1-
like [Arabidopsis thaliana] Length = 1 uo


501 


2028501 


1E-27 >sp|Q96330|FLAV_ARATH FLAVONOL SYNTHASE (FLS)
>gi|1628622 (U72631) flavonol synthase [Arabidopsis thaliana] >gi|1805305

synthase [Arabidopsis thaliana] >gi|1805309 (U84260) flavonol synthase
[Arabidopsis thaliana] Length = 336


502 


2028502 


4E-61 >gi|3176686 (AC003671) Similar to high affinity potassium
transporter, HAK1 protein gb|U22945 from Schwanniomyces occidentalis.
[Arabidopsis thaliana] Length = 764 


503 


2028503 


4E-61 >sp|P15455|12S1_ARATH 12S SEED STORAGE PROTEIN
PRECURSOR >gi|81604|pir||S08509 cruciferin precursor (CRA1) - Arabidopsis
thaliana >gi|166676 (M37247) 12S storage protein CRA1 [Arabidopsis thaliana]
>gi|808936|emb|CAA3249


504 


2028504 


Tyr_Phospho_Site(13-20)


505




-^i^ilonROi RA { Ar'C\C\'\f\AR\ ia<5mnnj^tp inriiifihlp nrntpin i^oioo 
[AraDiaopsis inaiidfidj Lciiyiit — <+r u 


506 


2028506


RO \ >sp|P32962|NRL2_ARATH NITRILASE 2 >gi|322548|pir||S31969
nitrilase (EC 3.5.5.1) - Arabidopsis thaliana >gi|22656|emb|CAA48377| (X68305)
nitrilase II [Arabidopsis thaliana] >gi|508733 (U09958) nitrilase [Arabidopsis
thaliana] Length = 339 


507 


2028507 


3' Pkc Phospho Site(41-43) 


508 


2028508 


3' Pkc_Phospho_Site(11-13)


509 


2028509 


AC Aa >.rtitR^ ft«n'i«lenlDA«A91 IPPJ^*^ ARATH PYTOnHRDMF P4^n 83A1 
rArohirlnneie thdlionol ^nil'^l RAI 9ft IHhi IRAAS>R'S'^2I (Dlf^^^^) CVtOChrOmS P450 

mnnonvx/npnaQP rArphir!nn<5i<; thalianal >ail4455306lemblCAB36841 .11 

(AL035528) cytochrome P450 monooxygenase {CYP83A1) [Arabidopsis thaliana] 


510 


2028510 


3' Tyr_Phospho_Site(289-296)


511 


2028511


3' Pkc_Phospho_Site(165-167)


512 


2028512 


5' Pkc Phospho Site(52-54) 


513 


2028513 


5' 2E-28 >gi|5815233|gb|AAD52608.1|AF173378_1 (AF173378) 60S acidic
ribosomal protein P0 [Homo sapiens] Length = 239


514 


2028514 


5' Tyr_Phospho_Site(127-135)






515 


2028515 


9E-47 >emb|CAA05629.1| (AJ002597) membrane-associated salt-inducible
protein like [Arabidopsis thaliana] Length = 428


516 


2028516 


Tyr_Phospho_Site(648-655)


517 


2028517 


2E-14 >gb|AAD17428| (AC006284) methyltransferase [Arabidopsis 
thaliana] Length = 619 


518 


2028518 


6E-23 >dbj|BAA18924| (D61395) gamma-VPE [Arabidopsis thaliana]
Length = 490 


519 


2028519 


Pkc_Phospho_Site(79-81)


520 


2028520 


2E-28 >sp|P43601|YFJ1_YEAST HYPOTHETICAL 55.1 KD PROTEIN IN
FAB1-PES4 INTERGENIC REGION >gi|1084743|pir||S56276 probable 
membrane protein YFR021w - yeast (Saccharomyces cerevisiae) 
i>niiR<ifi77RiHhiiRAAnci?fin 1 1 ^050617^ YFR021W FSaccharomvces cerevisiae! 

Length = 500






'^F ><5nlP4fi'^?'^ICLPA BRANA ATP-DEPENDENT CLP PROTEASE ATP- 
BINDING SUBUNIT CLPA PRECURSOR >gi|480969|pir||S37557 clpA protein - 
rape (fragment) >gi|406311|emb|CAA53077| (X75328) clpA [Brassica napus]
Length = 874 


522 


2028522 


Tyr_Phospho_Site(1092-1098)


523 


2028523 


Tyr_Phospho_Site(727-735)


524 


2028524 


6E-64 >gb|AAD30599.1|AC007369_9 (AC007369) Similar to RNA helicases
[Arabidopsis thaliana] Length = 1166


525 


2028525 


1E-106 >pir||S44943 sulfate adenylyltransferase (EC 2.7.7.4) -
Ar'^KiH^noie fhoiiono '>nii9 1 9Q7zi'^lnirll*^fiftn94 ^iiilfatp adenvlvltransfepase (EC 
2.7.7.4) precursor (clone APS2) - Arabidopsis thaliana 
>gi|487404|emb|CAA55799| (X79210) sulfate adenylyltransferase [Arabidopsis
thaliana] >gi|1228104 (U06276) ATP sulfurylase [Arabidopsis thaliana]
>gi|1378028 (U40715) ATP sulfurylase precursor [Arabidopsis thaliana]
>gi|1575324 (U59737) ATP sulfurylase [Arabidopsis thaliana] Length = 476


526 


2028526 


Tyr_Phospho_Site(1807-1814)


527 


2028527 


8E-59 >gi|3249077 (AC004473) Similar to prunasin hydrolase precursor
gb|U50201 from Prunus serotina. ESTs gb|T21225 and gb|AA586305 come from
this gene. [Arabidopsis thaliana] Length = 439 


528




IP RQ !>nhiAAr^AQQQ^ 1 !Acno72*iQ 8 (AC007259) clucose transporter 
[Arabidopsis thaliana] Length = 522


529 


2028529 


4E-75 >gb|AAB63620.1| (AC002343) trehalase precursor isolog
[Arabidopsis thaliana] Length = 557 


530 


2028530 


2E-23 >gb|AAD21456.1| (AC007017) transcription factor E2F5
[Arabidopsis thaliana] Length = 532 


531 


2028531 


Tyr_Phospho_Site(654-660) 






4F-9R >nii94Q4l44 (AC002329) predicted leucine-rich protein
[Arabidopsis thaliana] Length = 526


533




1E-13 >emb|CAA22523| (AL034563) transcription initiation factor iif, beta 
subunit [Schizosaccharomyces pombe] Length = 307 






7F-19 >nhIAAn?7870 11AF134155 1 (AF134155) RING finger protein
[Arabidopsis thaliana] Length = 170


535




Tyr_Phospho_Site(557-564)


536




Pkc_Phospho_Site(66-68)


537 


2028537 


3' 4E-17 >gi|2244792|emb|CAB10215.1| (Z97336) ankyrin like protein
[Arabidopsis thaliana] Length = 936


538 


2028538 


3' Pkc Phospho Site(74-76) 


539 


2028539 


3' Tyr Phospho Site(738-746) 


540 


2028540 


3' Pkc Phospho Site(78-80) 


541 


2028541 


3' Pkc_Phospho_Site(80-82)






542 


2028542 


5' 2E-66 >gi|2459443 (AC002332) NAD(P)-dependent cholesterol
dehydrogenase [Arabidopsis thaliana] Length = 480 


543 


2028543 


5' Tyr Phospho Site(543-551) 






Tyr_Phospho_Site(245-252)


545 


2028545 


5' Pkc Phospho Site(1-3) 


546




^' RP-RQ >gi|4538926|emb|CAB39662.1| (AL049483) phosphatidylserine
decarboxylase [Arabidopsis thaliana] Length = 628


547 


2028547 


5' 3E-22 >gi|1931650 (U95973) disease resistance protein RPM1 
isolog [Arabidopsis thaliana] Length = 821


548 


2028548 


1E-168 >emb|CAB52174.1| (AJ245407) syntaxin protein [Arabidopsis
thaliana] Length = 341 


549 


2028549 


Pkc_Phospho_Site(20-22) 


550


2028550


>nhiAAn^nnn'^ iiAnnn7?'SQ 16 ^AC007259'i Unknown orotein 

rArahirlnnQiQ thali;?nal 1 pnnth = 308 
[/nl CiOlUOpolo LI ioiicii lOj 1— ciiyiii 


551




^F-'^'^ >pmhiOAR'=iift'^4 11 (AJ243961) contains eukaryotic protein kinase
domain PF|00069 [Oryza sativa] Length = 844


552 


2028552 


1E-62 >pir||S58494 IAA7 protein - Arabidopsis thaliana >gi|972917
(U18409) IAA7 [Arabidopsis thaliana] Length = 243


553 


2028553 


Pkc Phospho Site(14-16) 


554 


2028554 


Tyr_Phospho_Site(164-171)


555 


2028555 


Pkc Phospho Site(31-33) 


556 




Rxz Qo ^£imh!^ AR'^if^Oftl 11 ^A! 117919^\A/n Hnmian (^-bpta reneat orolein 

rQr'hi"7r4ccir'r'hamm\/r'OQ nnmhpl 1 pnnth — fiOft 
[our ll^UooLrLrl lai ol 1 lyLrOo jJVJIItUCJ i— diyiii ooo 


557 




RF Qft ^ >sp|O24496|GLO2_ARATH HYDROXYACYLGLUTATHIONE
HYDROLASE CYTOPLASMIC (GLYOXALASE II) (GLX II)
>gi|1924921|emb|CAA69644| (Y08357) hydroxyacylglutathione hydrolase
[Arabidopsis thaliana] Length =


558 


2028558 


Pkc Phospho Site(64-66) 


559


2028559


T\/r Phncnhn *^itp^9'^n-9Rfi^ 


560 


2028560 


3' Tyr_Phospho_Site(168-174)


561 


2028561 


3' 2E-15 >gi|4539452|emb|CAB39932.1| (AL049500)
phosphoribosylanthranilate transferase [Arabidopsis thaliana] Length = 857


562 


2028562 


3' 2E-19 >gi|2894378|emb|CAA74910.1| (Y14573) ribophorin 1 homologue
[Hordeum vulgare] Length = 473 


563 


2028563 


3' Pkc_Phospho_Site(39-41)


564


2028564


AF Ifi !>nil'^Q1'^RQ4kninR7R9'SIIF9 AOIJAF TRANSLATION INITIATION 

FACTOR IF-2 >gi|2984268 (AE000769) initiation factor IF-2 [Aquifex aeolicus] 

1 pnnth = ftn'^ 


565




Pkc_Phospho_Site(231-233)


566 


2028566 


5' Pkc Phospho Site(4-6) 


567 


2028567 


5' Tyr_Phospho_Site(17-24)


568 


2028568


■1 c ftn \ •^.niioofti "1 no ^APnn9'^'^'^\ pnHnrhitina^p i^olon rArahidoDsis 
thaliana] Length = 281 


569 


2028569 


Pkc_Phospho_Site(61-63)


570 


2028570 


OCT HQ >.or^ID'3'5^ 7/ MOl IQF k'IMFQIW 1 IKF PROTFIN KIF4 
ot-iy "^SplroOT r4|l\lr4 IVIVJUoc rxilNCOiiN-LirxC. rr\V-/ 1 c-mn r\ir*t 

•>gi| lUoo4l A ]pir||AD4oUo rTllCrOLUDUie-dbbUOldLtiU lilULUl rs^li H- I MUUoC 

^nil'^R'^77'^tHhilRAAn91R7l ^ni9R4.R^ KIF4 FMii<5 mij<;c:ulusl Lenath = 1231 




2028571 


6E-70 >gi|3367517 (AC004392) Similar to F4I1.26 beta-glucosidase
gi|3128187 from A. thaliana BAC gb|AC004521. ESTs gb|N97083, gb|F19868
and gb|F15482 come from this gene. [Arabidopsis thaliana] Length = 527


572 


2028572 


Tyr Phospho Site(165-173) 


573 


2028573 


Tyr Phospho Site(162-169) 


574 


2028574 


4E-12 >emb|CAB38825.1| (AL035679) kinesin like protein [Arabidopsis
thaliana] Length = 1121


575 


2028575 


9E-84 >gi|1931645 (U95973) Fe(II) transporter isolog [Arabidopsis
thaliana] Length = 374 


576 


2028576 


Tyr_Phospho_Site(299-305) 


577


2028577 


Zt-OO >Sp|UDOOOO|oon AKA 1 rl oAlVIIVIM-oLU l AlVI T L n T UrxVJLAOC 

PRECURSOR (GAMMA-GLU-X CARBOXYPEPTIDASE) (CONJUGASE) (GH)

^gi[olDyDOD ^Aruo/ I'fij garnrTid"yiuianiyi Myuiuiciot? [rviauiuupoio iiidticiMcij 

1 onnth — '^Qfi 

Lengin — ozo 


578


2028578


IP '^Q >omhlPAR'^ft9Q4l (h\ (Y\^PiC\^\ fnrm?jmiria<5p-likp nmtpin 
[Arabidopsis thaliana] Length = 432


579




o» op o^ 'i^nil'l 7n7nnft (\ I7ft79i\ '^n^ rihnQnmpI nrntpin icsnlnn 

TAraKiH/^ricic ihaliQnol I onnth — ^0*^ 
[r\l aUtUUpoio UlcillciMclJ 1— t7liyiM 0\J<j 


580






581






582 


2028582 


5' 9E-21 >gi|4263791|gb|AAD15451| (AC006068) receptor protein kinase
[ArauiQopsis inaiiaridj Lefiyiii — out 


583




Tyr_Phospho_Site(710-718)


584




T\/r Phncnhn Qitp/^fi'^9-R'^ft\ 
1 yi rllUo|JllU OUC^UO^ OOO/ 


585 


2028585 


Pkc Phospho Site(77-79) 


586 


2028586 


1E-63 >emb|CAA11285.1| (AJ223384) 26S proteasome regulatory ATPase

SUDUniX 1 UD Wu) [Ivlcll lUUUd ocXldJ Lciiyill — Oi7U 


587




KKC Knospno oiTe\o-/ j 


588


2028588


Kga^oyo-oy / ; 


589 


2028589 


2E-23 >gi|2149380 (U85036) syntaxin homolog [Arabidopsis thaliana]

>gi|0^(3 1 U^D|ernD|L/AD 1 UOOO.ZJ ^Z.y/O^'fj byiUdXiri [AlaUIUUpolo ItlctMdndJ Uoliyui 

= 255 


590 


2028590


1 yr rnospno oiie(4yo-oui; 


591 


2028591


rKC rnospno olie^ 1 / o- 1 / 0) 






QCr OA -^iimhir* A 0070*^01 ^709770^ fariP9 rMvrnhartprii im tiihpm iln<5i<?l 

1 onnth — An*^ 

Loi lyin — HKJO 




2028593


RP OR '>nil'^'^9R^^Q^ ^AFnni'^9n^ PpnfiHp Chain Rp|pa<5P Fartnr 9 
rr^hlam\/HiQ traphnmatiQl 1 p^nnth — '^fiQ 


594 


2028594 


Tyr_Phospho_Site(153-160)


595


2028595


1 yr rnospno oiie( 1 1 o- 1 z i j 


596 




I yr rnospno oiie^44o-4oo; 


597 


2028597 


Pkc Phospho Site(30-32) 


598 


2028598 


Rgd(459-461)


599 


2028599 


o ot-zl >gip f oZoi / ^UDZ/40; cyiosKcieiai proieiri [Aiduiuupoio 

tholifanol 1 Qnnth — 7ft9 

inaiianaj Lengin — /oz 


600


2028600




601 


2028601 


3' 8E-62 >gi|4097505 (U63020) D1 protein [Magnolia pyramidata]
Length = 353 


602 


2028602 


3' Tyr_Phospho_Site(118-126)


603 


2028603 


o rKC rrtospno olte(oD-oo} 


604 


2028604 


5' Tyr Phospho Site(720-727) 


605 


2028605 


5' 2E-59 >gi|499301|emb|CAA54383| (X77116) ABI1 [Arabidopsis thaliana]
>gi|549981 (U12856) abscisic acid insensitive protein [Arabidopsis thaliana]
>gi|4538937|emb|CAB39673.1| (AL049483) protein phosphatase ABI1
[Arabidopsis thaliana] Length =


606 


2028606 


5' 6E-50 >gi|1709786|sp|P54904|PROC_ARATH PYRROLINE-5-
CARBOXYLATE REDUCTASE (P5CR) (P5C REDUCTASE)
>gi|541894|pir||JQ2334 pyrroline-5-carboxylate reductase (EC 1.5.1.2) - 
Arabidopsis thaliana >gi|166815 (M76538) pyrroline carboxylate reductase 
[Arabidopsis thaliana] >gi|1632776|emb|CAA70148| 






607 


2028607 


5' Pkc_Phospho_Site(33-35) 


608


2028608 


1E-48 >gb|AAD10854.1| (U60135) serine/threonine protein phosphatase
2A-3 catalytic subunit [Arabidopsis thaliana] Length = 352 




2028609


Pkc_Phospho_Site(56-58)


610 


2028610 


Tyr Phospho Site(62-68) 


611


2028611


>pmhlPAR^9'Sfi1 11 ^Al lOQSIQ'i stromal ascorbate neroxldase 
[Arabidopsis thaliana] Length = 372


612 


2028612 


8E-51 >gi|3421077 (AF043521) 20S proteasome subunit PAC1
[Arabidopsis thaliana] Length = 250 


613 


2028613 


1E-82 >gi|3341695 (AC003672) thiamin pyrophosphokinase 
[Arabidopsis thaliana] Length = 263 


614 


2028614 


Pkc_Phospho_Site(2-4)


615


2028615


1E-47 >emb|CAA18212.1| (AL022198) SERINE CARBOXYPEPTIDASE II-
like protein [Arabidopsis thaliana] Length = 425


616 


2028616 


Pkc Phospho Site(55-57) 


617 


2028617 


Pkc_Phospho_Site(15-17)


618


2028618


'^P 97 >sp|P49691|RL4_ARATH 60S RIBOSOMAL PROTEIN L4 (L1) Length = 404


619


2028619




620 


2028620


OiZ-^f -^yl oZOZo 1 0 \Hl\^\J\JH I yjD ) Vaouuicii oui III ly 1 c;L>t5|Jiui -iii\c piuiciii 
FArahiHnnQici thfllij^nal >nin8in'S88 ^AnOD^'^QS'l vanuolar sortino receotor-like 
nrntpin rArahiHnn^i<5 thalianal 1 pnnth = 628 


\>£. 1 


2028621


9F-4'^ >pmbinAA2'^n2'^ 1 1 (AL035394) phosphatase like protein
[Arabidopsis thaliana] Length = 350 




2028622




623 


2028623 


5' Pkc Phospho Site(4-6) 


624 


2028624 


5' Pkc Phospho Site(9-11) 


625 


2028625 


5' Tyr Phospho Site(35-41) 


626 


2028626


4b-o4 >gi|ooDyDoy|enriD|UAA/uoDu. 1 1 ^aluo \o>jh) poiassium irdiioporLtsi 
AtKT5p (AtKT5) [Arabidopsis thaliana] Length = 846 


627 


2028627 


5' 3E-74 >gi|585421|sp|P38418|LOXC_ARATH LIPOXYGENASE,
CHLOROPLAST PRECURSOR >gi|541879|pir||JQ2391 lipoxygenase (EC
1.13.11.12) AtLox2 - Arabidopsis thaliana >gi|431258 (L23968) lipoxygenase
[Arabidopsis thaliana] Length = 896 


628 


2028628 


Tyr_Phospho_Site(35-41)


629


2028629


OC OQ '^nil9fi9'1 7Qft /AP^^^ft^^^ trancr-rintinnal rpniilatnr 
/itZ"^y -^ylliiDZ 1 f yo \r\CU\J\JO\J\J } LlailoLrlipUUIlal iCyUialUI 

[Methanobacterium thermoautotrophicum] Length = 151 


630


2028630


OCT ^niH 1 ft't {\ A1 944^ thinnin rArahirinnQiQ thfllianal 
^tZ~0\j •^yU 1 1 0 t OO 1 1 l_*T 1 *.HH-^ UIIUIIIII l^/AI aUIUL'lJolo ii iciiiai ictj 

*>niM RftRfl'^'^lnrfl 19904*^00 A thinnin rArphirinn<;i<5 ttialianfll 1 pnnth = ir^4 


631




9P '^A >nhlAAPRQR1Q 11 ^AFn727'^fi^ hpta-nlijrn<;ida<?e [Pinus contortal 


632 


2028632 


7E-32 >gi|3599491 (AF085149) aminotransferase [Capsicum chinense] 

1 pnnth — ARQ 


633 


2028633 


Pkc Phospho Site(39-41) 


634 


2028634 


Pkc Phospho Site(23-25) 


635


2028635


1 yr Knospno oiiev^z-yyj 


636 


2028636 


1E-82 >emb|CAA11525.1| (AJ223635) transcription factor IIA large subunit
[Arabidopsis thaliana] Length = 375


637 


2028637 


7E-27 >pir||S30578 proteinase inhibitor II - Arabidopsis thaliana
>gi|16427|emb|CAA48892| (X69139) protease inhibitor II [Arabidopsis thaliana]
>gi|4038041 (AC005936) proteinase inhibitor II [Arabidopsis thaliana] Length = 77


638 


2028638 


2E-68 >dbj|BAA19751| (D85339) hydroxypyruvate reductase [Arabidopsis
thaliana] Length = 386 






639 


2028639 


ALLO-TA) (L-ALLO-THREONINE ACETALDEHYDE-LYASE) 
>gi|2190272|dbj|BAA20404| (D87890) L-allo-threonine aldolase [Aeromonas
jandaei] Length = 338 


640 


2028640 


1E-12 >gi|3193298 (AF069298) T14P8.17 gene product [Arabidopsis
thaliana] Length = 154 


641 


2028641 


Tyr_Phospho_Site(213-220)


642 


2028642 


4E-25 >sp|04yyf 2|uOAi! bKAJU o-AUtlNUoYLMt 1 nlUININc 
n^nADDOYVI AQP PP^P^I7V^^P 9 ^AnnMPTnp 9^ /'^AMFIP 9^ >nil9fifi94nfi 

/I IQnO'tftX Q Q/H^inrtex/I 1 mo+hinnino Hor*arhrkY\/laco TRraQQir*?! iiinr'P?il 1 pnnth " 
(UOUy lO) O'cUcnOSyi*"!— ~ri Icli IIUI III it? UcOai UUAyicloc [Diaooii^a juiiocoj L.diLjLii 

369 


643


2028643


= 420 


644 


2028644


o 1 yr rnospno oiie^zyu-ouo; 


645 


2028645 


5' Tyr Phospho Site(29-36) 


646 


2028646 


5' 1E-73 >gi|5902365|gb|AAD55467.1|AC009322_7 (AC009322) splicing
factor Prp8 [Arabidopsis thaliana] Length = 2359 


647 


2028647 


5' 6E-37 >gi|1542941|emb|CAA55006| (X78116) Acetoacetyl-coenzyme A
thiolase [Raphanus sativus] Length = 406


648 


2028648 


Rgd(383-385)


649 


2028649 


4E-61 >gb|AAD45605.1|AF160729_1 (AF160729) isovaleryl-CoA-
dehydrogenase precursor [Arabidopsis thaliana] Length = 409


650 


2028650 


Tyr_Phospho_Site(1078-1085)


651 


2028651 


1E-105 >emb|CAA16684| (AL021684) oxoglutarate dehydrogenase - like
protein [Arabidopsis thaliana] Length = 973


652 


2028652 


2E-52 >sp|Q45223|HBD_BRAJA 3-HYDROXYBUTYRYL-COA
DEHYDROGENASE (BETA-HYDROXYBUTYRYL-COA DEHYDROGENASE) 
(BHBD) >gi|1209052 (U32229) HbdA [Bradyrhizobium japonicum] Length = 293 


653 


2028653 


Tyr_Phospho_Site(711-719)


654 


2028654 


nCT -^rtMOVO QOOn / A/^^^^i^v^ 7n\ r*\r\r\cirr\r\\i\ C^c\ti. roHi list's oo FArohiHrtncic 

>gi o/ oooZU ^AL/UUO i (yj) Ginnamoyi ooa reuuoicibc [Miduiuupoib 
thaliana] Length = 303 


655 


2028655 


3E-67 >gi|2952433 (AF051135) ubiquitin activating enzyme E1
[Arabidopsis thaliana] Length = 454 


656 


2028656 


Pkc Phospho Site(31-33) 


657 


2028657 


Tonb_Dependent_Rec_1(1-75)


658


2028658 


oE-o7 >SP|P49dd1 |GUrD_UKYoA OUA 1 UMtK UtL 1 A oUbUNI I ^UtLIIA- 
COAT PRO 1 tlN) (UhL 1 A-L/Or) (AKUnAIN; >gijno14U4y|enriD[oAAy lyu 1 j 
\LX>(\d\>£.} arcnain/oeiia-Lrvjr [^^ryza sativaj Lerigiri — o lo 


659


2028659


yt:-4U ^QDJ|t5AA04cJt5D. I| \r\}r\J\j\JH£.o} yOTO [AiaUlUUpolo IDallaildJ Lcliyui — 

126 


660 


2028660 


3' Pkc Phospho Site{8-10) 


661


2028661


o rKC rnospno olie^ iZo- izo; 


662




o z.inc ringer o^nz^oo-ooj 


663 


2028663 


5' Tyr_Phospho_Site(544-551)


664 


2028664 


5 2E-54 >gi|2129727|pir||b71zzy KiNA-Dinaing protein of - AraDiaopsis 
tnaliana >gi|1 1 /4loo (U44io4; KINA-Dinaing protein [AraDiuopsis inaiianaj uengin 

- ooe 

— OOD 


665 


2028665 


Tyr Phospho Site(383-390) 


666 


2028666


1E-51 >emb|CAA17552| (AL021961) Phosphoglycerate dehydrogenase -
like protein [Arabidopsis thaliana] Length = 603 


667 


2028667 


Tyr Phospho Site(371-378) 


668 


2028668 


Pkc Phospho Site(25-27) 






669 


2028669 


Pkc Phospho Site(14-16) 


670 


2028670 


3E-29 >gi|4155557 (AE001526) CYCLOPOCYCLOPROPANE FATTY
ACID SYNTHASE [Helicobacter pylori J99] Length = 389


671 


2028671 


2E-79 >emb|CAA09208| (AJ010469) RNA helicase [Arabidopsis thaliana]
Length = 360 


672 


2028672 


Tyr_Phospho_Site(319-325)


673 


2028673 


1E-113 >gb|AAD55787.1|AF181966_1 (AF181966) methylenetetrahydrofolate 
reductase MTHFR1 [Arabidopsis thaliana] Length = 592 


674 


2028674 


Tyr_Phospho_Site(1157-1163)


675




3E-69 >gi|3421090 (AF043525) 20S proteasome subunit PAE2
[Arabidopsis thaliana] Length = 237


676 


2028676 


1E-56 >gi|4063738 (AC005851) zinc finger protein [Arabidopsis
thaliana] >gi|4803961|gb|AAD29833.1|AC006202_11 (AC006202) unknown
protein [Arabidopsis thaliana] Length = 284


677 


2028677 


Pkc Phospho Site(22-24) 


678 


2028678 


Tyr_Phospho_Site(174-180)


679




4E-43 >emb|CAA47807| (X67421) extA [Arabidopsis thaliana] Length =
127 


680 


2028680 


3' Tyr_Phospho_Site(195-202)


681 


2028681 


3' 4E-14 >gi|120532|sp|P19976|FRI_SOYBN FERRITIN PRECURSOR (SOF-
"^^^ >gi|81773|pir||A40992 ferritin precursor - soybean >gi|169953 (M64337)
ferritin light chain [Glycine max] Length = 250 






Rgd(36-38)


683 


2028683 


3' 4E-35 >gi|3047064 (AF058825) contains similarity to peptidyl-prolyl
cis-trans isomerase (Pfam: pro_isomerase.hmm, score: 23.86 and 28.41)
[Arabidopsis thaliana] Length = 281


684 


2028684 


3' Pkc_Phospho_Site(11-13)


685




^' Pkr Phn«inhn 5^itp^47-49^ 


686




2E-19 >gi|6322411|ref|NP_012485.1|MTR4| RNA helicase; Mtr4p
>gi|1352980|sp|P47047|MTR4_YEAST ATP-DEPENDENT RNA HELICASE
DOB1 (mRNA TRANSPORT REGULATOR MTR4) >gi|1078374|pir||S56822 SKI2
protein homolog YJL050w - yeast (Saccharomyces cerevisiae)
>gi|1008185|emb|CAA89341| (Z49325) ORF YJL050w


687 


2028687 


5' Tyr_Phospho_Site(622-629)


688 


2028688 


5' Rgd(156-158) 


689


2028689


Pkc_Phospho_Site(29-31)






Tyr_Phospho_Site(350-356)


691 


2028691 


1E-14 >gi|3834312 (AC005679) Strong similarity to glycoprotein EP1
gb|L16983 Daucus carota and a member of S locus glycoprotein family
PF|00954. ESTs gb|AA067487, gb|Z35737, gb|Z30815, gb|Z35350,
gb|AA713171, gb|AI100553, gb|Z34248, gb|AA728536, gb|Z30816 an... Length




2028692 


Pkc Phospho Site(2-4) 


693


2028693 


6E-28 >gi|4102703 (AF015274) ribulose-5-phosphate-3-epimerase
[Arabidopsis thaliana] Length = 281


694


2028694 


Tyr Phospho Site(295-303) 




2028695 


Tyr Phospho Site(790-796) 




2028696 


Tyr Phospho Site(151-158) 


697 


2028697 


Pkc Phospho Site(26-28) 


698 


2028698 


1E-59 >emb|CAA74372| (Y14044) geranylgeranyl reductase [Arabidopsis
thaliana] Length = 472


699 


2028699 


Tyr_Phospho_Site(823-830)


700 


2028700 


Tyr_Phospho_Site(159-166)


701 


2028701 


2E-13 >gi|4249409 (AC006072) sugar transporter [Arabidopsis
thaliana] Length = 348


702 


2028702 


8E-76 >emb|CAB38611.1| (AL035656) extensin-like protein [Arabidopsis
thaliana] Length = 448 


703 


2028703 


6E-83 >sp|P29513|TBB5_ARATH TUBULIN BETA-5 CHAIN
>gi|320186|pir||JQ1589 tubulin beta-5 chain - Arabidopsis thaliana >gi|166902
(M84702) beta-5 tubulin [Arabidopsis thaliana] Length = 449


704 


2028704 


3' Tyr_Phospho_Site(418-424)


705 


2028705 


3' 4E-39 >gi|4103987 (AF030516) 5,10-methylenetetrahydrofolate
dehydrogenase-5,10-methenyltetrahydrofolate cyclohydrolase [Pisum sativum]
>gi|6002383|emb|CAB56756.1| (AJ011589) 5,10-methylenetetrahydrofolate
dehydrogenase: 5,10-methenyltetrahydrofolate cyclohydrolase [Pisum sativum]
Length = 294 


706 


2028706 


3' Tyr Phospho Site(470-478) 


707 


2028707 


5' Pkc Phospho Site(35-37) 


708 


2028708 


5' Pkc_Phospho_Site(18-20)


709 


2028709 


5' Pkc_Phospho_Site(236-238)


710 


2028710 


5' Pkc_Phospho_Site(7-9)


711 


2028711 


5' 6E-43 >gi|6006879|gb|AAF00654.1|AC008153_6 (AC008153) eukaryotic
translation initiation factor 3 subunit [Arabidopsis thaliana] Length = 294 


712 


2028712 


5' 2E-61 >gi|1750376 (U80808) ubiquitin activating enzyme
[Arabidopsis thaliana] >gi|3150409 (AC004165) ubiquitin activating enzyme
(UBA1) [Arabidopsis thaliana] Length = 1080


713






714 


2028714 


5' Pkc_Phospho_Site(186-188)






'^P 9Q *>niI'^Cl1 A1Q1 IcnlP'^^RR'^ftlOr^TI RAT 1 ir^P-NI- 

o oci-,^y -^gi|oy 1 *t 1 y 1 [spj"ODOoo[w i r\Mi uur-iN- 
ACETYLGLUCOSAMINE--PEPTIDE N-
ACETYLGLUCOSAMINYLTRANSFERASE 110 KD SUBUNIT (O-GLCNAC
TRANSFERASE P110 SUBUNIT) >gi|1931579 (U76557) O-GlcNAc transferase,
p110 subunit [Rattus norvegicus] Length = 1036 


716 


2028716 


5' 3E-71 >gi|5931694|emb|CAB56597.1| (Y18470) Exportin1 (XPO1) protein
[Arabidopsis thaliana] Length = 1075 


717 


2028717 


Tyr_Phospho_Site(450-458)


718 


2028718 


5E-43 >pir||S58118 thioredoxin - Arabidopsis thaliana
>gi|1388076 (U35640) thioredoxin h [Arabidopsis thaliana] Length = 118 


719




QP-A*^ '>nil^9R7fi77 /^AP^^'^Q7Q^ ^onfai'no cimNoriKf tr\ trancrrinttrin 

yc-'tv) •^y\\0£.o I xj f ( \^r\ouuoy/yy ooniduio oiiiiiianty lu uciiioLriipuuii 
fartnr ^TINIY'i i<:;nlnn Tn^>On4 99 nhl9nfi?174 frnm A thaliana RAO nhIAnnnifi4^ 
rArahiHnriQiQ thalianal 1 f^nnth = 144 


720 


2028720 


2E-11 >emb|CAB45279.1| (AL079313) hypothetical protein similar to
(M97204) goliath protein [Drosophila melanogaster] [Homo sapiens] Length = 104 


721 


2028721 


1E-94 >gb|AAD20931| (AC006234) diacylglycerol kinase [Arabidopsis
thaliana] Length = 493 


722 


2028722 


Tyr Phospho Site(688-695) 


723 


2028723 


Pkc Phospho Site(45-47) 


724 


2028724 


Tyr_Phospho_Site(303-311)


725


2028725


It- 1 f o ■^gii^^zu^oD ^AOuuDuoyj oeia- 1 jO-giucanaoe [rvraDiaopsis 
thaliana] Length = 439 


726




^F-'^P ><;nlP'^41?4lPR^ft niCni PROTFASF RFRtJI ATORY J^IJRIJMIT 

8 (TAT-BINDING PROTEIN HOMOLOG 10) >gi|422297|pir||JN0610 probable
transcription factor DdTBP10 - slime mold (Dictyostelium discoideum) (fragment)
>gi|290057 (L16579) HIV1 TAT-binding protein [Dictyostelium discoideum]
Length = 389 


727 


2028727 


7E-86 >gb|AAD25787.1|AC006577_23 (AC006577) Similar to gi|1653162
(p)ppGpp 3-pyrophosphohydrolase from Synechocystis sp genome gb|D90911.
EST gb|W43807 comes from this gene. [Arabidopsis thaliana] Length = 715


728 


2028728 


3E-13 >gi|3420745 (AF079445) TipC [Dictyostelium discoideum] Length
= 3848


729 


2028729 


3' 2E-16 >ail4538906iemhlCAR'^Qfi4'^ 1 1 (A\ 04Q4ft9\ rhnlinpi kinaeo rimnk'On 
like protein [Arabidopsis thaliana] Length = 346


730 


2028730 


3' Pkc_Phospho Site(64-66) 


731 


2028731 


3' Pkc_Phospho_Site(114-116)


732 


2028732 


3' Tyr_Phospho_Site(227-234)


733 


2028733 


3' Rgd(568-570) 


734 


2028734 


3' Tyr_Phospho_Site(13-20)


735 


2028735 


3' Tyr_Phospho_Site(172-180)


736 


2028736 


\j tt- u*t ^yij^ 1 -tou I ojjjii ||/-vu/ DOit. iiurncutic proisin dcli -AraDlQOpsiS 
thaliana >nli11 995'^'H /'U'^QQ44^ RFl 1 1 rAr?»hirlnnciQ thalionQl I onnth - Rin 


737 


2028737 


5' 2E-21 >ninQ1 9Q1 7InhlAAP7ftRQ'^ i 1 ^AF^^i7^ft^ MAk' liU-o eor/thr rkr^+^-Ni^ 
kinase [Arabidopsis thaliana] Length = 707


738 


2028738 


5' Pkc_Phospho_Site(3-5)


739 


2028739 


5' Tyr_Phospho_Site(301-309)


740 


2028740 


Pkc_Phospho_Site(69-71)


741 


2028741 


Pkc Phospho Site(38-40) 


742 


2028742 


Tyr_Phospho Site(478-485) 




2028743


Pkc_Phospho_Site(2-4)


744 


2028744 


1E-31 >emb|CAA16524.1| (AL021633) DNA topoisomerase like-protein

[Mictuiuupoio iiiaiidridj Lenyin — i i f y 


745 


2028745 


1E-71 >gi|2347191 (AC002338) DNA binding protein isolog
[Arabidopsis thaliana] >gi|3150397 (AC004165) DNA-binding protein
[Arabidopsis thaliana] Length = 393 


746 


2028746 


2E-80 >gi|3377808 (AF075597) contains similarity to Nicotiana alata 
pistil extensin-like protein (GB:U45958) [Arabidopsis thaliana] Length = 165


747 


2028747 


1E-33 >sp|P54888|P5C2_ARATH DELTA 1-PYRROLINE-5-CARBOXYLATE
SYNTHETASE B (P5CS B) [INCLUDES: GLUTAMATE 5-KINASE (GAMMA-
GLUTAMYL KINASE) (GK); GAMMA-GLUTAMYL PHOSPHATE REDUCTASE
(GPR) (GLUTAMATE-5-SEMIALDEHYDE DEHYDROGENASE) (GLUTAMYL-
GAMMA-SEMIALDE... >gi|887388|emb|CAA60447| (X86778) pyrroline-5-
carboxylate synthetase B [Arabidopsis thaliana] >gi|1669658|emb|CAA70527|
(Y09355) pyrroline-5-carboxylate synthetase [Arabidopsis thaliana] Length = 726


748 


2028748


zc-oH- ^ernD[uMD'tooi> i . 1 1 {RLVouZoZ) DerDerine Dridge enzyme-like 
protein [Arabidopsis thaliana] Length = 530 


749


2028749


on.-^f '^gDiAAUoyzon .1 |AOUU/^o/D_4 (AOUU/o7b) initiation factor 5A-4 

[Ml ciUiUu{Joio Uldlldlldj UciiyLil — 1 OO 


750 


2028750


oc-Ho -^gi|oy^ lozz (Aruozyio; transcription tactor [Arabidopsis 

thflliflnal 1 pnnth =: *?AQ 
u ictiiat laj lyii 1 — ^*+C7 


751 


2028751 


oc: 13 -'tJiiiujv-'MD lu^oy. 1 1 ^z.y/ oof ) nyaroxyproiine-ricn glycoprotein 
homolog [Arabidopsis thaliana] Length = 507 


752 


2028752 


Tyr_Phospho_Site(757-764)


753 




1 yi r IIUo|JIIU OUcy^O \ \j~0£.Z.) 


754 


2028754 


3' Tyr_Phospho_Site(427-434)


755 


2028755


o 1 yr rnospno oiie^/ou-/oc>) 


756 


2028756 


3' 8E-32 >gi|1076534|pir||A55333 monodehydroascorbate reductase (NADH)
(EC 1.6.5.4) - garden pea >gi|497120 (U06461) monodehydroascorbate
reductase [Pisum sativum] Length = 433


757 


2028757 


5' 1E-16 >gi|3337095|dbj|BAA31843| (AB016206) polygalacturonase inhibitor 
(PGIP) [Citrus iyo] Length = 327


758 


2028758 


5' 4E-42 >gi|4586249|emb|CAB40990.1| (AL049640) pollen surface protein
[Arabidopsis thaliana] Length = 403 






759 


2028759 


5' Pkc Phospho Site(5-7) 


760 


2028760 


5' Tyr_Phospho_Site(560-566)


761 


2028761 


2E-40 >emb|CAA76178.1| (Y16327) cyclic nucleotide-regulated ion
channel [Arabidopsis thaliana] Length = 716


762 


2028762 


Tyr_Phospho_Site(178-185)


763 


2028763 


7E-12 >dbj|BAA13831| (D89169) similar to Saccharomyces cerevisiae
SCD6 protein, SWISS-PROT Accession Number P45978 [Schizosaccharomyces
pombe] Length = 370


764 


2028764 


Rgd(288-290) 


765 


2028765 


Tyr Phospho Site(21-27) 


766 


2028766 


Tyr Phospho Site(722-729) 


767 


2028767 


Tyr_Phospho_Site(1033-1039)


768 


2028768 


Pkc_Phospho Site(45-47) 


769 


2028769 


yi|-Tv^ww^ \f\r\j£.^ooo) vcoiuic-dbbociaiea memDrane protein 
7B; synaptobrevin 7B [Arabidopsis thaliana] Length = 219


770 


2028770 


3E-82 >emb|CAA10320| (AJ131205) mitochondrial NAD-dependent
malate dehydrogenase [Arabidopsis thaliana] Length = 341


771 


2028771 


Pkc_Phospho Site(277-279) 


772 


2028772 


Pkc_Phospho_Site(13-15)


773 


2028773 


Pkc_Phospho_Site(168-170)


774 


2028774 


^"-^ ^1 1 luivxy-vtjvjuu^^. i{ ^Mju 1 lyjHH^j uysteine syninase [AraDioopsis 
thaliana] Length = 176


775 


2028775 


1E-27 >pir||S65071 cystatin - field mustard >gi|762785 (L41355)
cysteine proteinase inhibitor [Brassica campestris] Length = 199


776 


2028776 


6E-62 >gi|3201633 (AC004669) cell division protein [Arabidopsis 
thaliana] Length = 695


777 


2028777 


5E-81 >sp|P25069|CAL2_ARATH CALMODULIN-2/3/5
>gi|99671|pir||S22503 calmodulin - Arabidopsis thaliana >gi|1076437|pir||S53006
calmodulin - leaf mustard >gi|2146726|pir||S71513 calmodulin - Arabidopsis
thaliana >gi|166651 (M38380) calmodulin-2 [Arabidopsis thaliana] >gi|166653 
(M73711) calmodulin-3 [Arabidopsis thaliana] >gi|474183|emb|CAA47690| 

CI c) calmodulin [Arabidopsis thaliana] >gi|497992 (U10150) calmodulin
[Brassica napus] >gi|899058 (M88307) calmodulin [Brassica juncea]
^gij 1 ioouuo|apj|DAAUozoo| (U4oo4o; calmodulin [Arabidopsis thaliana]
'-yipH-u^/ uo \M\^KJUH^oi) unknown protein [Arabidopsis thaliana] >gi|3885333
/ACOOBfi^r^^ calmodulin [Arabidopsis thaliana] '^#nil00Qyin7i.^»-fiM onocon a
calmodulin 2 [Arabidopsis thaliana] Length = 149


778 


2028778 


1E-13 >emb|CAA18500| (AL022373) Myb-type transcription factor
[Arabidopsis thaliana] Length = 272 


779 


2028779 


3' Pkc_Phospho_Site(37-39)


780 


2028780 


5' Pkc Phospho Site(59-61) 


781 


2028781 


Tyr Phospho Site(305-312) 


782 


2028782 


Tyr_Phospho_Site(2-9)


783 


2028783 


Pkc Phospho Site(63-65) 


784 


2028784 


Pkc Phospho Site(87-89) 


785 


2028785 


Tyr_Phospho_Site(412-419)


786 


2028786 


4E-39 >gb|AAD46410.1|AF096260_1 (AF096260) ER66 protein [Lycopersicon 
esculentum] Length = 558


787 


2028787 


Pkc_Phospho_Site(21-23)


788 


2028788 


Pkc Phospho Site(24-26) 


789 


2028789 


Tyr_Phospho Site(68-75) 


790 


2028790 


3' 4E-27 >gi|4678261|emb|CAB41122.1| (AL049657) proteasome regulatory
subunit [Arabidopsis thaliana] Length = 406


791


2028791


3' Pkc_Phospho_Site(129-131)






792 


2028792 


5' Tyr Phospho Site(6-12) 


793 


2028793 


5' 9E-27 >gi|4914387|gb|AAD32922.1|AC007167_4 (AC007167) heat-shock 
protein [Arabidopsis thaliana] Length = 780


794 


2028794 


Serpin(210-220)


795 


2028795 


Tyr Phospho Site(327-334) 


796 


2028796 


Pkc Phosoho SitpnS-'^7'V 


797 


2028797 


1E-45 >ail40Q'^1 'S'S ^AFn88981\ nhx/inr^hmmo acoi^oio+iarl n»r#-if£Nir> ^ 

FArabidODsis thalianal Lpnnth = 9fi7 


798 


2028798 


3E-51 >gb|AAD25794.1|AC006550_2 (AC006550) Similar to gb|U51990 pre-
mRNA-splicing factor hPrp18 from Homo sapiens. ESTs gb|TAR'^Qi and
gb|AA721815 come from this gene. [Arabidopsis thaliana] Length = 420


799 


2028799 


Pkc Phosnho SiteHI-l'^^ 


800 


2028800 


Tyr_Phospho Site(202-209) 


801 


2028801 


oc HO ^eniujortDoyo/ y. ij ^ALU4y4oo; beta-galactosidase [Arabidopsis
thaliana] Length = 79Q


802 


2028802 


-"CI 1 lujv-r/A/n I otoo. Ij \r\Ljj£itLOH I ) sennc/inreonine Kinase-iiKe protein 
[Arabidopsis thaliana] Length = fi'^'^


803 


2028803 


1E-74 >gi|3044218 (AF057144) signal peptidase [Arabidopsis thaliana]
Length = 167


804 


2028804 


Tyr_Phospho Site(707-715) 


805 


2028805 


Pkc Phospho Site(22-24) 


806 


2028806 


Tvr Pho^inhn ^itpn9'^-'^'^9^ 


807 


2028807 


5E-65 >emb|CAB16773.1| (Z99707) Cu2+-transporting ATPase-like protein
[Arabidopsis thaliana] Length = o ly


808 


2028808 


8E-63 >gb|AAD17333| (AF125574) lysyl-tRNA synthetase; LysRS
[Arabidopsis thaliana] >gi|6041823|gb|AAF02138.1|AC009918_10 (AC009918)
lysyl-tRNA synthetase [Arabidopsis thaliana] Length = 626


809 


2028809 


2E-56 >gi|2909781 (AF020288) MgATP-energized glutathione S-
conjugate pump [Arabidopsis thaliana] Length = 1623


810 


2028810 


3' Tvr Pho<5nho 5;itp^74Q-7'Sfi^ 


811 


2028811


3' 2E-20 >gi|1655424|dbj|BAA11944| (D83531) GDP dissociation inhibitor
[Arabidopsis thaliana] >gi|3212878 (AC004005) GDP dissociation inhibitor 
[Arabidopsis thaliana] Length = 445 


812 


2028812 




813 


2028813 


y 6E-15 >gi|4325346|gb|AAD17345.1| (AF128393) similar to N-
ethylmaleimide sensitive fusion proteins; contains similarity to ATPases (Pfam:
ri v/\juu*+, oouic? o\jt ,t , c:— 1 .*tc?-ooii IN — 1 ; [Mraoiaopsis inauanaj Lengtn — ( (Z 


814 


2028814 


3' Rgd(690-692) 


815 


2028815 


5' 1E-33 >gi|2905657 (AF047469) arsenite translocating ATPase
[Homo sapiens] Length = 348 


816 


2028816 


5' Tyr_Phospho_Site(417-424)


817 


2028817 


5' 5E-44 >gi|5929906|gb|AAD56636.1|AF162150_1 (AF162150) COP1-
interacting protein CIP8 [Arabidopsis thaliana] Length = 334 


818 


2028818 


Tvr Phnc;nhn *^itfi/'R^J.-8fi9\ 


819 


2028819 


Tyr Phospho Site(556-564) 


820 


2028820


Pifr Phncnho Qifo/'l'^ 

1 "iiuspno oiiey^io-iOj 


821 


2028821 


4E-35 >sp|P49688|RS2_ARATH 40S RIBOSOMAL PROTEIN S2
>gi|2335095 (AC002339) 40S ribosomal protein S2 [Arabidopsis thaliana] Length
= 285 


822 


2028822 


6E-23 >ref|NP_004862.1|pGOSR1| golgi SNAP receptor complex member 1
>gi|4234774 (AF073926) cis-Golgi SNARE p28 [Homo sapiens] Length = 250


823 


2028823 


Tyr_Phospho_Site(409-416)






824 


2028824 


Pkc_Phospho Site(2-4) 






oc-DD ; •^emD|UMD lo/yD. 1 1 (^v^fV/) MAroK-liKe protein kinase 
[Arabidopsis thaliana] Length = 799 


826 


2028826 


7E-71 >emb|CAB10557.1| (Z97344) trehalose-6-phosphate synthase like
protein [Arabidopsis thaliana] Length = uDO


827 


2028827 


1E-139 >gi|2262167 (AC002329) cytosolic ribosomal protein S4 
[Arabidopsis thaliana] Length = 261


828 


2028828 


4E-12 >gi|3327957 (AF060490) TLS-associated protein TASR-2 [Mus 
musculus] >gi|3327976 (AF067730) TLS-associated protein TASR-2 [Homo 
sapiens] Length =


829 


2028829 


2E-30 >pir||S59544 stress-induced protein OZI1 precursor - Arabidopsis
thaliana >gi|/^yuooo [UZvo4f) mRNA corresponding to this gene accumulates in
response to ozone stress and pathogen (bacterial) infection; pathogenesis-
related protein [Arabidopsis thaliana] >gi|z^ozoDy (ArUlozy4) No definition line
found [Arabidopsis thaliana] Length = 80 


830 


2028830


vc-Hf ^uuj|DMMZHoy4{ \uooZ\jo) protein kinase [Arabidopsis thaliana]
Length = 49fi


831 


2028831 


Pkc_Phospho Site(8-10) 


832 


2028832 


Tyr_Phospho Site(58-64) 






o It- [4 ->gi|4uz/oyo (Ai-U4yoo2) alpha-expansin precursor 
[iNiouudiid [dudcurnj Lengin — zo/ 


834 


2028834




835 


2028835 


5' 8E-44 >gi|484656|pir||JU0182 monodehydroascorbate reductase (NADH) 
(EC 1.6.5.4) - cucumber >gi|452165|dbj|BAA05408| (D26392) 
monodehydroascorbate reductase [Cucumis sativus] Length = 434


836 


2028836 


Tyr_Phospho_Site(419-426)






T\/r Phncnhn Qito/f^7Q c;J5f^\ 

1 yi riiobpno oiie\^D/ y-ooDy 


838 


2028838 


3E-32 >sp|Q45223|HBD_BRAJA 3-HYDROXYBUTYRYL-COA
DEHYDROGENASE (BETA-HYDROXYBUTYRYL-COA DEHYDROGENASE)
(BHBD) >gi|1209052 (U32229) HbdA [Bradyrhizobium japonicum] Length = 293


839




ib-i4 >gi|o4bio4U (AC005315) reverse transcriptase [Arabidopsis
thaliana] Length = lozy


840




1 t-iD >dbj|DAA/^ooo4.1| (AB017d93) transcription factor [Nicotiana
tabacum] Length = 291 


841 


2028841 


8E-66 >gi|2160694 (U73528) B' regulatory subunit of PP2A [Arabidopsis
thaliana] Length = 522


842 


2028842 


Tyr_Phospho_Site(194-200)


843 


2028843 


Pkc_Phospho Site(28-30) 


844


2028844


3' 2E-23 >gi|2129770|pir||S71224 xyloglucan endotransglycosylase-related
protein XTR-2 - Arabidopsis thaliana >gi|1244756 (U43487) xyloglucan
endotransglycosylase-related protein [Arabidopsis thaliana]
>gi|2154611|dbj|BAA20290| (D63510) endoxyloglucan transferase related protein
[Arabidopsis thaliana] >gi|5533311|gb|AAD45124.1|AF163820_1 (AF163820)
endoxyloglucan transferase [Arabidopsis thaliana] Length = 332






O rKC rnospno olTe(4/-44) 




2028846


o 1 yr rnospno oite(ioz-iDU) 


847 


2028847 


3' 8E-13 >gi|1076421|pir||S46523 transcription factor TGA3 - Arabidopsis
thaliana >gi|ou4i lo (LiUiiuyj transcription factor [Arabidopsis thaliana] Length =
384 


848 


2028848 


3' Pkc Phospho Site(2-4) 


849 


2028849 


5' Tyr Phospho Site(764-771) 


850 


2028850 


2E-65 >emb|CAA67338| (X98806) peroxidase ATP20a [Arabidopsis 
thaliana] Length = 330 


851 


2028851 


3E-99 >emb|CAB45075.1| (AL078637) serine/threonine kinase-like protein
[Arabidopsis thaliana] Length = 445 






852 


2028852 


4E-71 >emb|CAB10698| (Z97558) argininosuccinate lyase [Arabidopsis
thaliana] Length = 'SIT


853 


2028853 


>gi|541880|pir||S4?n8R mevalonate kinase (EC 2.7.1.36) - Arabidopsis thaliana
>gi|456614|emb|CAA54820| (X77793) mevalonate kinase [Arabidopsis thaliana]
>gi|4883990|gb|AAD31719.1|AF141853_1 (AF141853) mevalonate kinase
[Arabidopsis thaliana] Length = 378 


854 


2028854 


Tyr_Phospho Site(53-60) 


855 


2028855 


Pkc_Phospho Site(62-64) 


856 


2028856 


2E-17 >gi|1899188 (U90212) DNA binding protein ACBF [Nicotiana
tabacum] Length = 49ft


857 


2028857 


Tyr_Phospho_Site(364-371)


858 




OXIDOREDUCTASE 40 KD SUBUNIT PRECURSOR (COMPLEX I-40KD) (CI- 
tur\L/^ '^yi| lu iouo|[jii[jo iouz:o iNMun uciiyurogenase ^UDiquinone^ \to l.o.o.o^ 
40K chain - Neuro^^nora f:ra«i«;a >nil'^n4filpmhinAA'^Q 


859 


2028859 


7E-25 >ail21Q1 I'SO ^AFnn79fiQ'^ Qimilar tn mitnrhnnrlnal rarrior f am fK/ 

[Arabidopsis thaliana] Length =


860 


2028860 


3' 8E-21 >ail4218120lemblCAA22Q74 1! ^'^^'^^Vi Prnlinp-rirh APn likp 
protein [Arabidopsis thaliana] Length = 367


861 


2028861 


3' Tyr_Phospho_Site(684-690)


862 


2028862 


3' Pkc_Phospho_Site(49-51)


863 


2028863 


3' Tyr Phospho Site(485-493) 


864 


2028864 


3' Wd_Repeats(436-450)


865 


2028865 


3' Pkc_Phospho_Site(50-52)


866 


2028866 


3' Pkc_Phospho_Site(23-25)


867 


2028867 


3' Pkc Phospho Site(2-4) 


868 


2028868 


3' Pkc Phospho Site(5-7) 






o oc-DD ->gi|Zoz/ f uo|ernD|uAAiDoc) 1 1 (ALU21oo4) myD - related protein 

[AM CIUlULf|JOlO LIlClIIClllClJ t_OliyLIl — OlH 


870 


2028870 


5' Pkc_Phospho_Site(101-103)


871 





o /c-zz ->gi|ozz/^oz|pir||A44zzo auxin-inaepenaent growth promoter - 
iMiuuitaiici lauduuiii •^yijooyyz 1 jernD[OMA\ODOf Uj ^^AOUoU I j axi 1 [iNlCOtiana 

tabacum] Length = 569 


872 


2028872 


Pkc Phncinhn S^\M9f\-9R\ 


873 


2028873 


4E-40 >gi|2435517 (AF024504) contains similarity to peptidase family
A1 [Arabidopsis thaliana] Length = 472 


874 


2028874 


Pkc Phospho Site(47-49) 


875 


2028875 


6E-43 >emb|CAB43855.1| (AL078465) isp4 like protein [Arabidopsis
thaliana] Length = /oo


876 


2028876 


2E-53 >sp|P54641|VATX_DICDI VACUOLAR ATP SYNTHASE SUBUNIT
AC39 (V-ATPASE AC39 SUBUNIT) (41 KD ACCESSORY PROTEIN) (DVA41)
>gi|626048|pir||A55016 lysosomal membrane protein DVA41 - slime mold 
(Dictyostelium discoideum) >gi|532733 (U13150) vacuolar ATPase subunit 
DVA41 [Dictyosteli 


877 


2028877 


1E-36 >emb|CAA18734.1| (AL022604) cysteine proteinase-like protein
[Arabidopsis thaliana] Length = ooo


878 


2028878 


8E-12 >pir||S71365 AP2 domain-containing protein - Arabidopsis 

thaliana >gi|1209099 (U40256) AINTEGUMENTA [Arabidopsis thaliana]

>gi|1244708 (U41339) ANT [Arabidopsis thaliana] 
>gi|4490720|emb|CAB38923.1| (AL035709) ovule development protein 
aintegumenta (ANT) [Arabidopsis thaliana] Length = 555 


879 


2028879 


2E-60 >gi|3738302 (AC005309) tubby-like protein [Arabidopsis thaliana] 
>gi|4249398 (AC006072) tubby protein [Arabidopsis thaliana] Length = 407 






880 


2028880 


Pkc_Phospho_Site(29-31)


881 


2028881 


1E-67 >emb|CAA16700.1| (AL021687) kinase-like protein [Arabidopsis
thaliana] Length = 290 


882 


2028882 


Pkc_Phospho_Site(2-4) 


883 


2028883 


1E-23 >emb|CAB40952.1| (AL049638) C-4 sterol methyl oxidase
[Arabidopsis thaliana] Length = '^0'^


884 


2028884 


3' Pkc Phosnhn SiW2V?^'» 


885 


2028885 


3' Pkc Phosnho SiW11-1'^^ 


886 


2028886 


o / 1- 1 £. -^yii^^ 1 uoo^-ojyu|/nMijoyuuo. 1 |Mv^uu f oy 1 zo ^Mouu/oyiy oimiiar to 

ail22113 Ac transoosase fORFa^ from 7pa mpvQ traneirrint nhwn^AOA 
[Arabidopsis thaliana] Length = 799 


887 


2028887 


3' Pkc_Phospho_Site(116-118)


888 


2028888 


3' Tyr_Phospho_Site(532-539)


889 


2028889 


5' Pkc_Phospho_Site(61-63)


890 


2028890 


5' Pkc_Phospho_Site(137-139)


891 


2028891 


5' Pkc Phospho Site(26-28) 


892 


2028892 


5' Pkc_Phospho_Site(74-76)


893 


2028893 


5' Tyr Phospho Site(604-610) 


894 


2028894 




895 


2028895 


8E-51 >gi|1336084 (U56635) Arabidopsis thaliana glutamate
dehydrogenase (GDH2) mRNA, complete CDS. [Arabidopsis thaliana] Length =
411 


896 


2028896 


''yiiooouoou \t\\^\j\joo^o) recepior-iiKe proiein Kinase 
[Arabidopsis thaliana] Length = 1007 


897 


2028897 


^t- o 1 ^fjii ||Ovjyooo \j \ r-uniuing proiein, oor\ - Araoioopsis tnaiiana 

>nilftn7^77 (\ f^TP-hinHinn nrn+Ckin FArciKi/Hrineio th'ali'ar»'3i1 1 — C-in 

-^yiiuu/ ij/ / ^L,oou \ H) \ji r-uu luiiiy piuLt;iii [MidDiaopsis inaiianaj Lengin — oiu 


898 


2028898 


Pkc_Phospho Site(7-9) 


899 


2028899 


2E-51 >gi|2231175 (U44050) mis5p [Xenopus laevis] Length = 796


900 


2028900


Htz-^^- ^emD|UMDO/ooD. 1 1 (Aj/i4oy/z) o-pnospnogluconolactonase [Homo 

Oayici loj l_ciiyiM ~ ZoO 


901 


2028901 





902 


2028902 


2E-80 >gb|AAD25843.1|AC006951_22 (AC006951) acyl-CoA synthetase 

j^/ni duiuupolo LildlldllclJ '^yi|HOOy'fDy [yUjMMUZ f yUO. 1 |MOUU / Z 1 O O (AOUU/zIo) 

acyl-CoA synthetase [Arabidopsis thaliana] Length = 720 


903 


2028903 


Pkc_Phospho_Site(12-14)






ri\o nnospno oite(DZ-04j 


905 


2028905 


1E-100 >pir||S59558 GTP-binding protein, 68K - Arabidopsis thaliana
>gi|807577 (L38614) GTP-binding protein [Arabidopsis thaliana] Length = 610 


906 


2028906 


5E-91 >gi|1773295 (U76707) regulatory protein NPR1 [Arabidopsis
thaliana] >gi|1916912 (U87794) transcription factor inhibitor I kappa B homolog
[Arabidopsis thaliana] Length = 593 


907 


2028907 


Tyr_Phospho_Site(812-819) 


908 


2028908 


3E-48 >gi|1750376 (U80808) ubiquitin activating enzyme [Arabidopsis
thaliana] >gi|3150409 (AC004165) ubiquitin activating enzyme (UBA1)
[Arabidopsis thaliana] Length = lUou


909 


2028909 


3E-17 >gi|2924793 (AC002334) similar to synaptobrevin [Arabidopsis
thaliana] Length = 212


910 


2028910 


3E-27 >pir||S71284 MYB-related protein 33.3K - Arabidopsis thaliana
>gi|1263095|emb|CAA90809| (Z54136) MYB-related protein [Arabidopsis
thaliana] Length = 305 


911 


2028911 


Tyr_Phospho_Site(91-99)


912 


2028912 


4E-37 >gb|AAD23951.1|AF093108_1 (AF093108) histone H3 [Tortula ruralis]










Length = 117


913 


2028913 


Tyr_Phospho_Site(1497-1504)


914 


2028914 


3E-17 >gb|AAD48636.1|AF165924_1 (AF165924) auxin-induced basic helix-
loop-helix transcription factor [Gossypium hirsutum] Length = 314


915 


2028915 


Pkc_Phospho_Site(52-54) 






4C-DU ; -^spiUoy 1 1 i_AKA 1 r1 rKUbAoLb NAUP-UbPENDENT 
OXIDOREDUCTASE P1 >gi|1362013|pir||S57611 zeta-crystallin homolog - 

Ml aUlUU|Jold Uldlldfld •^yi|oOD'fZO|eiTlD{UAMoyt500| \Z.4y /DO) ZGla-CryStailin 

homologue [Arabidopsis thaliana] Length = 345


917 


2028917 


Tyr_Phospho_Site(9-16)


918


2028918


r'KC_r'nOSpnO o{te(1o-iCU) 


919 


2028919 


8E-77 >gi|2454184 (U80186) pyruvate dehydrogenase E1 beta subunit
[Arabidopsis thaliana] Length = 406 


920 


2028920 


4E-28 >emb|CAB56768.1| (AJ132096) squamosa promoter binding
protein-like 12 [Arabidopsis thaliana] >gi|6006403|emb|CAB56769.1| (AJ132097)
squamosa promoter binding protein-like 12 [Arabidopsis thaliana] Length = 927 




2028921


3 dE-32 >gi|4678360|emb|CAB41170.1| (AL049659) Cytochrome P450-like
protein [Arabidopsis thaliana] Length = 490




2028922


o bt-oz >gi|41b7oo|sp|P3282D|CBPX_ARATH SERINE 
UAKDUAYrtr 1 lUAot PKbUUKoUK >gi|lbDb74 (MS1 130) carboxypeptidase 
T-iiKe proiein [AraDiuopsis inaiianaj >gi|44oi^U|prT||iyuo4iibA carDoxypeptidase 
T [Mi auiuupoio xriaiianaj uengin — ooy 


923 


2028923 


3' Pkc_Phospho_Site(76-78)




2028924


o rKc_rnospno olie^ZI-^o) 


925 


2028925 


3' Tyr_Phospho_Site(147-154)




2028926


o 1 yr rnospno o!ie(ou-ocsj 


927


2028927


o 1 yr rnospno oiTe(4/4-4oi) 


928


2028928


o Db-^^ >gi|ziy^uuo4|aDj|DAAzoioU| (Uoooobj delta 9 desaturase 
[Arabidopsis thaliana] Length = 305 


929 


2028929 


5' 5E-48 >gi|2944446 (AF050756) cysteine endopeptidase precursor
[Ricinus communis] Length = obu


930 


2028930 


Tyr_Phospho_Site(672-680)


931


2028931


Tyr_Phospho_Site(28-36)


932 


2028932 


4E-23 >sp|P74707|RF1_SYNY3 PEPTIDE CHAIN RELEASE FACTOR 1 
(RF-1) >gi|1653916|dbj|BAA18826| (D90917) peptide chain release factor 
[Synechocystis sp.] Length = 365 


933 


2028933 


1E-12 >gi|2947070 (AC002521) Ser/Thr protein kinase [Arabidopsis
thaliana] Length = 429




2028934


1E-92 >gi|2062171 (AC001645) DNA binding protein (CDC27SH) isolog
[Arabidopsis thaliana] Length = (\(


935 


2028935 


7E-29 >pir||S51938 protein kinase homolog - Arabidopsis thaliana
>gi|717180|emb|CAA55866| (X79279) protein kinase homologous to shaggy and 
glycogen synthase kinase-3 [Arabidopsis thaliana] Length = 421 


936


2028936


Pkc_Phospho_Site(99-101)


937


2028937


Pkc_Phospho_Site(79-81)


938 


2028938 


7E-21 >gi|1399183 (U50739) Lycopene beta cyclase [Arabidopsis
thaliana] >gi|bUob2U2|gb|AAF02819.1|AC009400_15 (AC009400) lycopene beta
cyclase [Arabidopsis thaliana] Length =


939 


2028939 


Tyr_Phospho Site(324-331) 


940 


2028940 


3' Pkc_Phospho_Site(7-9)


941 


2028941 


3' 6E-11 >gi|4115538|dbj|BAA36412| (AB012116) UDP-glycose:flavonoid 
glycosyltransferase [Vigna mungo] Length = 381


942 


2028942 


3' Tyr Phospho Site(584-591) 












944 


2028944 


5' 3E-43 >gi|3912988|sp|O22456|AGL9_ARATH FLORAL HOMEOTIC
PROTEIN AGL9 >gi|2345158 (AF015552) AGL9 [Arabidopsis thaliana]
>gi|2829878 (AC002396) AGL9 [Arabidopsis thaliana] Length = 251






^' Pkr PhnQnhn ^ito/^^R-RO^ 
O rl\0 r IlUopflU Odc^oO OU^ 








947 


2028947 


1E-70 >sp|P41343|FENR_MESCR FERREDOXIN--NADP REDUCTASE
PRECURSOR (FNR) >gi|320548|pir||A44974 ferredoxin--NADP+ reductase (EC

Lies, i.z; precursor - common ice plant >gi|io/zoD (MZodZo) ferredoxin-NADP+
reductase precursor ^TnrM, i.d./. i ) [Mesembryanthemum crystallinum] -^giiidz


948 


2028948 


Pkc_Phospho_Site(152-154)


949


2028949


4t-oz: ^goiAAO/ o44i .1 1 (uyz4DU; 12-oxophytodienoate reductase OPR2
[Arabidopsis thaliana] >gi|6143903|gb|AAF04449.1|AC010718_18 (AC010718)
12-oxophytodienoate reductase (OPR2) [Arabidopsis thaliana] Length = o^4


950 


2028950 


Tyr_Phospho_Site(874-880) 


951


2028951


otr-yz ->gi|oof ouu ^Aru^ ooy r ) similar to glycosyl hydrolases family y
(Pfam:glycosyl_hydro5.hmm, score: 100.70) [Arabidopsis thaliana] Length = 516


952 


2028952 


2E-11 >emb|CAB56146.1| (AL117669) large secreted protein
[Streptomyces coelicolor A3(2)] Length = 809 


953 


2028953 


1E-155 >gb|AAC95171.1| (AC005970) protein kinase [Arabidopsis
thaliana] Length = 4R9


954 


2028954 


Tyr_Phospho_Site(183-189)


955 


2028955 


1E-23 >gi|3319370 (AF077409) contains similarity to C3HC4-type zinc
fingers (Pfam: zf-C3HC4.hmm, score: 32.94) [Arabidopsis thaliana] Length = 233 


956 


2028956 


Pkc_Phospho_Site(259-261)


957 


2028957 


2E-73 >gb|AAD46404.1|AF096248_1 (AF096248) ethylene-responsive RNA
helicase [Lycopersicon esculentum] Length = 474 


958 


2028958 


8E-13 >gi|3377808 (AF075597) contains similarity to Nicotiana alata
pistil extensin-like protein (GB:U45958) [Arabidopsis thaliana] Length = 165 


959 


2028959 


3' Pkc_Phospho_Site(20-22) 


960


2028960


o 5E-1o >gi|o4ooD70|reT|NP_00D339.1 |pGTC90| Golg I transport complex 
protein (yu Kua; >gi|ooUozoo (ArUoo/ lo) loo ooigi transport complex yUkU 
suDunii Drain-speciTic isoiorm [nomo sapiensj Lengtn — c>oy 


961 


2028961 


3' 2E-25 >gi|2244748|emb|CAB10171.1| (Z97335) disease resistance Cf-2 like
protein [Arabidopsis thaliana] Length = ooy






o nKO nriuspno oiio\oi-oo^ 








964


2028964


Twr Phncnhn <^Ht:^(AO 0C)\ 


965


2028965


ofz-oi -^emD[OAD'4DUuu. 1 1 ^z.y/oo0; selenium-binding protein like
[Arabidopsis thaliana] Length = 478 


966


2028966


Pkc Phospho Site(96-98) 


967


2028967


Pkc Phospho Site(62-64) 


968 


2028968 


Pkc_Phospho Site(25-27) 


969 


2028969 


Pkc_Phospho_Site(47-49)


970 


2028970 


5E-94 >dbj|BAA24226| (AB001568) phospholipid hydroperoxide
glutathione peroxidase-like protein [Arabidopsis thaliana] >gi|3004869 
(AF030132) glutathione peroxidase; ATGP1 [Arabidopsis thaliana] 
^yi|H-oov7H-o 1 [ci I lUjOMDoyyo 1 . 1 1 ^ALu*+youu/ pnospnoiipio nyoroperoxioe 
glutathione peroxidase [Arabidopsis thaliana] Length = 169 


971 


2028971 


2E-56 >sp|P10797|RBS3_ARATH RIBULOSE BISPHOSPHATE
CARBOXYLASE SMALL CHAIN 2B PRECURSOR (RUBISCO SMALL SUBUNIT
2B) >gi|68061|pir||RKMUB2 ribulose-bisphosphate carboxylase (EC 4.1.1.39)
small chain B2 precursor - Arabidopsis tha 


972


2028972


3E-75 >gb|AAD41430.1|AC007727_19 (AC007727) Similar to gb|Z11499 protein
disulfide isomerase from Medicago sativa. ESTs gb|AI099693, gb|R65226, 

nhIA ARJ^yil "1 ohlTA'^nftfi nhlT407R4 ny\\TAAt\(\K /-ihlT7R/l/lK nMU'iRTi'i 
yD|AAOO/ ON, gD| 1 40UDO, gD| 1 4^ f 04, gD| 1 14UU0, gD| 1 (0440, gD|noC) / OO, 

gb|T43168 and gb|T 






1 1- lUU '>Sp|UU4U 1 y|r KoA AKAIn Zoo r KU 1 bAob KboULA 1 UKY 
SUBUNIT 6A HOMOLOG (TAT-BINDING PROTEIN HOMOLOG 1) (TBP-1) 
'>gi|zo4zD/ D ^ALrUuuiUDj oimnar to proDaoie Mg-aepenaent a i rase 
(pir|S56671). ESTs gb|T46782,gb|AA04798 come from th 


974


2028974


1 t-4o -^gDiAAUouy ^ d.i |Arizn oyo_i \f\r\ ^\o^o) uoncnoi-pnospnate-mannose 
syninase [Lrnceiuius gnseusj Lengin — <<ido 


<3 t \} 




9P_ftft ">nil'^7n9'^9'1 /Ar^nrm^Q7\ Tf^C Koto riaoon+rtr inf^iror'+inn r^rr»^'£^in 
[Ml clUIUUpolo LlldllallcfJ Lc^liyUI ~ O^O 


976 


2028976 


6E-67 >gi|619745 (U18929) cytochrome p450 dependent
monooxygenase [Arabidopsis thaliana] Length =


977


2028977


3' Tyr Phospho Site(600-607) 


978


2028978


o rKc_rnospno_oiie( i f - ly; 


979 


2028979 


3' Pkc_Phospho_Site(28-30) 


980


2028980


o ib-o/ >gi|oo4oUoo|gb|AAUoDbyy| (Ar07o5ol) protein phosphatase-2C;
PP2C [Mesembryanthemum crystallinum] Length = 344


981 


2028981 


5' 6E-64 >gi|2462746 (AC002292) Similar to ATP-citrate-lyase
[Arabidopsis thaliana] Length = 423 


982


2028982


5' 5E-14 >gi|2459737 (U95375) oxidoreductase [Haloferax volcanii]
Length = 255 


983 


2028983 


2E-19 >sp|P46689|GAS1_ARATH GIBBERELLIN-REGULATED PROTEIN 1
PRECURSOR >gi|2129588|pir||S71441 GAST1 protein homolog (clone GASA1) -
Arabidopsis thaliana >gi|887939 (U11766) GAST1 protein homolog [Arabidopsis
thaliana] Length = yo


984


2028984


zb-bo >gi|ooo4o12 (Au00ob79) Strong similarity to glycoprotein EF'1 
gD|LiDyoo uaucus caroia ana a merriDer or o locus glycoprotein Tamiiy 

rr|UUy04. Co l S gD|AAUO/40/, gD|il.OO/ or , gD|Z.oUO 10, gD|z.oooou, 
nhlAA71'^17i nhlAM 00'^'^'^ nhiy^^^Aft nhlAA79ft*^^R nhlT'^nftifion 1 onnfh 

yu|MM/ 1 o 1 / 1 , yu(Mi 1 uuDoo, gu|i_0'fz*+o, yu|MM/ zoooo, yu|z.ouo i o an... Lenyin 


985 


2028985 


Tyr_Phospho_Site(1020-1028)


986


2028986


1 yr_r nospno oiie^/oD-/y4^ 


987 


2028987 


Pkc Phospho Site(2-4) 


988 


2028988 


Tyr_Phospho Site(555-561) 


989 


2028989 


Tyr Phospho Site(10-17) 


990 


2028990 


9E-62 >gb|AAD41999.1|AC006233_10 (AC006233) NAM protein [Arabidopsis 
thaliana] Length = 335 


991 


2028991 


6E-37 >sp|P35133|UBCA_ARATH UBIQUITIN-CONJUGATING ENZYME E2-
17 KD 10 (UBIQUITIN-PROTEIN LIGASE 10) (UBIQUITIN CARRIER PROTEIN 

10) >gi|421858|pir||S32672 ubiquitin--protein ligase (EC 6.3.2.19) UBC10 -
Arabidopsis thaliana >gi|297878|emb|CAA78715| (Z14991) ubiquitin conjugating 
enzyme [Arabidopsis thaliana] >gi|349213 (L00640) ubiquitin conjugating enzyme 
[Arabidopsis thaliana] Length = 148 


992


2028992


^b-^o >emD|uAAiboo4.l 1 (ALOzl /49) bOrl protein-liKe protein 
[Arabidopsis thaliana] Length = ^oo


993


2028993


4b-4u >gb|AADyoouy. 1 1 (ALrUUoiuo) soluble epoxide hydrolase
[Arabidopsis thaliana] Length = 320 


994 


2028994 


7E-28 >gb|AAD24462.1|AF118855_1 (AF118855) trans-prenyltransferase [Mus
musculus] Length = OOD


995 


2028995 


Tyr_Phospho_Site(674-680)


996 


2028996 


Pkc Phospho Site(36-38) 


997 


2028997 


9E-14 >dbj|BAA21425| (AB004537) WEB1 PROTEIN 
[Schizosaccharomyces pombe] >gi|2950507|emb|CAA17835| (AL022072) web1
homolog; protein transport protein; WD-repeat protein [Schizosaccharomyces 
pombe] Length = 1224


998 


2028998 


7E-47 >emb|CAB43966.1| (AL078579) acyl-CoA binding protein
[Arabidopsis thaliana] Length = 354 


999 


2028999 


2E-50 >gi|1732570 (U72153) beta-glucosidase [Arabidopsis thaliana] 
Length = 525 






